Abstract: Results obtained with the grant fall into several groupings: (a) Combinatorial
conditions  on  the  graphical  representation  of  a  two-dimensional  sensor  network  that  will 
guarantee  localizability  of  the  network  in  the  event  of  loss  of  any  p  sensors  and/or  q  links  in 
the  network,  for  nonnegative  integers  p  and  q;  (b)  analysis  of  the  effects  of  measurement 
error  on  the  quality  of  localization  of  sensor  positions  in  a  sensor  network,  or  more  generally 
a  target  being  localized;  (c)  the  derivation  of  a  measure,  including  algorithms  for 
computing  it,  of  the  quality  of  connectivity  of  a  network  modeled  by  a  graph  with  nodes  and 
links,  and  in  which  the  individual  links  are  operative  with  defined  a  priori  probabilities,  and 
the  probability  that  any  one  link  is  operative  is  independent  of  the  probability  that  any  other 
link  is  operative;  (d)  connectivity  and  capacity  of  networks  with  randomly  positioned  nodes 
and  probabilistic  channel  models;  (e)  Doppler  localization  problems  and  miscellaneous 
multiagent problems.

Introduction 

Background  to  the  problems  considered 

The  applications  context  of  the  work  is  sensor  networks  in  the  first  instance.  A  sensor 
network  comprises  a  collection  of  sensors,  some  of  which  are  at  known  positions  due  to  a 
priori  measurement  or  the  equipping  of  such  sensors  with  GPS,  while  others  are  at  unknown 
positions.  Sensors  without  positioning  capability  however  can  measure  distances  to  other 
sensors  provided  these  other  sensors  are  sufficiently  close  (‘within  the  sensing  radius’). 
(Such  measurements  are  distinct  from  those  which  the  sensor  network  is  meant  to  collect, 
which  can  relate  to  intruders,  fire,  water,  chemicals,  etc).  Sensor  network  localization  is  the 
task  of  determining  from  all  this  data  the  positions  of  all  the  sensors.  (Ability  to  do  this  is  an 
obviously  desirable  feature  if  the  collected  information  relating  to  intruders,  fire,  etc  is  to  be 
of  use.)  A  sensor  network  is  termed  localizable  when  such  determination  is  possible. 

Localizability is not the same concept as localization; localizability is a property that
must hold before localization can be attempted. The CI, in conjunction with others, developed a graphical
(combinatorial)  characterization  of  the  localizability  property  that  was  published  several 
years  ago.  Extensions  of  that  work  also  included  identification  of  classes  of  network  for 
which  localization  could  be  easily  achieved  (ease  of  localization  being  characterized  more  or 
less  by  computational  complexity  of  the  localization  algorithm). 

There  are  shortcomings  in  these  results,  since  they  rest  on  idealized  situations  not  necessarily 
met  in  practice.  For  example: 

•  Links  or  sensors  themselves  may  fail,  particularly  in  a  large  network.  The  existing 
theory  characterizing  network  localizability  made  no  allowance  for  this. 

•  Measurements  of  intersensor  distances  are  noisy.  The  existing  theory  made  no 
attempt  to  relate  the  errors  in  intersensor  distance  measurement  to  errors  in  the 
localized  positions  of  those  sensors  lacking  a  priori  position  knowledge 


The  failure  of  links  or  sensors  in  a  sensor  network  is  a  particular  case  of  failure  of  links  or 
nodes  in  a  normal  communications  network.  It  can  be  the  case  that  individual  link  quality  in 
a  network  can  be  characterized  by  a  simple  probability,  viz.  the  probability  that  the  link  will 
function  properly  on  any  occasion  that  it  passes  a  message.  Most  message  passing  in  a 
network  however  involves  transmission  of  a  message  over  a  series  of  links,  with  multiple 
paths  often  existing  between  source  and  destination.  A  method  has  been  lacking  for 
assembling  the  individual  link  qualities  (probabilities)  into  a  measure  for  the  overall  quality 
of  a  network. 

In  many  networks,  whether  sensor  networks  or  ad  hoc  networks  or  mobile  networks,  the 
node  positions  are  essentially  random.  For  this  reason  and  especially  for  large  scale  networks, 
it  can  be  desirable  to  characterize  the  performance  of  such  networks  in  terms  of  statistical 
parameters,  such  as  node  density,  rather  than  via  deterministic  graphical  properties. 
Accordingly,  techniques  are  needed  to  characterize  connectivity  and  throughput  in  these 
random  situations. 

In  the  course  of  our  work,  several  quite  specific  applications  problems  presented  themselves. 
These  included  the  use  of  measurements  of  Doppler  shift  for  localization.  In  a  number  of 
military  situations,  there  can  be  a  single  high  speed  target,  and  cooperative  but  spatially 
separated  sensors. 


Problems  considered 

The  problems  considered  in  the  research  were  driven  by  the  mismatch  between  available 
theory  and  the  demands  of  real-world  application  recorded  above.  The  principal  problems 
addressed  were  originally  as  follows: 

•  Finding  combinatorial  conditions  on  the  graphical  representation  of  a 
two-dimensional  sensor  network  for  the  localizability  property  to  be  retained  in  the 
event  of  loss  of  any  p  sensors  and/or  q  links  in  the  network 

•  Explaining  for  a  localizable  sensor  network  how  errors  in  inter-sensor  distances 
translate  into  errors  in  sensor  position  estimates  (and  more  generally,  how  sensing 
errors  in  any  localization  scenario  affect  the  quality  of  position  estimate  of  a  target) 


While  the  results  being  sought  were  primarily  analytic  in  character  (given  a  network,  say 
what  properties  it  has),  it  was  our  hope  that  the  results  would  give  design  insights  as  well. 


Starting in the first year, but continuing much more in the second year, additional problems
were addressed, motivated by those already considered:

•  Defining  a  quality  measure  for  a  network  (not  a  sensor  network)  characterized  by  a 
set  of  nodes  and  a  set  of  links  with  each  link  functioning  with  a  defined  probability, 
and  such  that  the  functioning  or  otherwise  of  two  links  are  independent  events 

•  Examining  connectivity  and  throughput  properties  of  networks  where  nodes  are 
randomly  positioned  (and  may  be  moving)  and  channel  models  for  transmission 
between  nodes  are  probabilistic 

•  Localization  using  Doppler  measurements  and  sundry  other  multiagent  problems, 
involving  a  mixture  of  formations,  sensor  networks  and  consensus  ideas. 

Publications  are  recorded  at  the  end,  in  groupings  corresponding  to  the  above  five  bullet 
points. 

Contextual  matters 

The  proposal  provided  that  interaction  with  US  workers  would  occur,  and  that  presentations 
would  be  made  at  US  conferences.  Interactions  did  occur,  particularly  with  Professor  S 
Dasgupta (University of Iowa) and Professor A S Morse (Yale University), and presentations
at  US  conferences  did  occur.  See  the  publication  list  for  coauthored  papers,  and  US 
conference  papers. 

Much  research  was  also  linked  with  collaborative  work  with  Defence  Science  and 
Technology  Organisation,  Australia  (DSTO);  the  focus  of  that  work  was  cooperative 
localization  of  targets. 

Outline  of  main  results 

Combinatorial  conditions  on  the  graphical  representation  of  sensor  networks 

A  graphical  representation  of  a  sensor  network  is  defined  by  a  graph  G=(V,E)  in  the  usual 
sense,  where  V  is  the  vertex  set  (abstracting  the  sensors)  and  E  is  the  edge  set  (abstracting 
those  sensor  pairs  for  which  the  intersensor  distance  is  known).  Sensors  are  divided  into 
those  whose  absolute  positions  are  known,  call  them  anchor  sensors,  and  ordinary  sensors. 

Localizability  of  a  sensor  network  in  an  ambient  two-dimensional  space  occurs  if  and  only  if 
three  conditions  simultaneously  hold  (these  conditions  are  reviewed  in  the  papers  in  question, 
which  give  the  original  references;  among  these  original  references  are  papers  by  the 
Principal  Investigator) 

•  The graph of the network is 3-connected (k-connectedness for any positive integer k
is a standard concept in graph theory; it means that between any two vertices, k
nonintersecting paths, i.e. connected sequences of edges linking one vertex to the
other, can be found)

•  There  are  3  or  more  noncollinear  anchor  nodes 

•  The  graph  of  the  network  is  redundantly  rigid.  (Rigidity  is  a  graph  theory  concept, 
capable  of  formal  definition  naturally,  which  captures  the  notion  that  the  graph  is  a 
pictorial  representation  of  a  rigid,  as  opposed  to  flexing,  structure.  For  example,  a 
graph  corresponding  to  a  quadrilateral  with  a  diagonal  is  a  rigid  graph,  while  a  graph 
corresponding  to  a  quadrilateral  without  a  diagonal  is  not  rigid.  ‘Redundantly  rigid’ 
means, in structural terms, that there are extra structural members beyond those
required to ensure rigidity; more precisely, rigidity is retained when any one graph
edge is removed.)
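The rigidity conditions above can be checked numerically on small graphs. The following is a minimal sketch, not the procedure used in the cited papers: it assumes the standard rank test on the rigidity matrix (a generic planar framework on n >= 2 vertices is rigid when that matrix has rank 2n - 3), evaluated at random integer positions with exact rational arithmetic. The function names, the fixed seed and the coordinate range are illustrative assumptions.

```python
import random
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_rigid(n, edges, seed=0):
    """Generic 2D rigidity: rank of the rigidity matrix equals 2n - 3."""
    if n <= 1:
        return True
    rng = random.Random(seed)
    # Random integer positions stand in for a generic configuration (assumption).
    pos = [(rng.randrange(1, 10**6), rng.randrange(1, 10**6)) for _ in range(n)]
    rows = []
    for i, j in edges:
        row = [0] * (2 * n)
        dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
        row[2 * i], row[2 * i + 1] = dx, dy
        row[2 * j], row[2 * j + 1] = -dx, -dy
        rows.append(row)
    return matrix_rank(rows) == 2 * n - 3


def is_redundantly_rigid(n, edges):
    """Rigid, and still rigid after the removal of any single edge."""
    edges = list(edges)
    return is_rigid(n, edges) and all(
        is_rigid(n, edges[:k] + edges[k + 1:]) for k in range(len(edges))
    )


# The quadrilateral examples from the text, plus the complete graph on 4 vertices.
quad = [(0, 1), (1, 2), (2, 3), (3, 0)]
quad_with_diagonal = quad + [(0, 2)]
k4 = quad_with_diagonal + [(1, 3)]
```

On these examples the sketch should report the plain quadrilateral as not rigid, the quadrilateral with a diagonal as rigid but not redundantly rigid, and the complete graph on four vertices as redundantly rigid. The 3-connectedness and anchor conditions are not checked here.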

Our key conclusions can be summed up as follows and are obtained in the papers listed
under the heading ‘Redundant Localizability’.

•  Conditions  ensuring  retention  of  localizability  given  sensor  loss  are  more  demanding 
than  conditions  ensuring  retention  of  localizability  given  link  loss:  in  fact,  if  a 
network  is  localizable  after  loss  of  any  p  sensors  and  q  links,  it  will  be  localizable 
after  loss  of  any  p+q  links 

•  In  general,  we  could  not  find  identical  necessary  and  sufficient  conditions  for 
retention  of  localizability  after  loss  of  p  sensors  or  of  q  links,  though  the  difference 
between  necessary  and  sufficient  conditions  was  not  great. 

•  With small enough values of p and q, identical necessary and sufficient conditions
could however be found. For example, a network with |V| sensors including 3 or
more noncollinear anchors and 2|V| links is localizable when any single nonanchor
sensor or link is lost, given an easily checked well-distributedness property of the
links. Networks meeting these conditions are easy to design.

•  Networks retaining localizability given loss of sensors and/or links are necessarily of
higher density than a network which simply has to be localizable. Density can be
measured by the ratio of links to sensors, or separately by the maximum value of k
for which the network is k-connected. For example, to be tolerant of the loss of q
links, it is necessary that the ratio of edges to vertices exceed (1/2)(q+3).

•  Algorithms for verifying retention of localizability given loss of p sensors and q
links with general p, q, i.e. algorithms checking either the necessary or sufficient
conditions, can be very complicated. However, there is one very easily checked
sufficient condition: if a network is (p+q+6)-connected and there are p + 3 or
more anchors, no three of which are collinear, then it will be localizable after loss of p
sensors and q links. We note too that a necessary connectivity condition is simply
that the network be (p+q+3)-connected.

•  A network whose sensors are randomly distributed according to say a Poisson
distribution (and in which there are p + 3 or more anchors, no three of which are collinear) will,
with very high probability, be (p+q+6)-connected (and thus robust to loss of p
sensors and q links) given a sufficiently large (and computable) density parameter
for the underlying Poisson distribution.
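The connectivity and density conditions in the list above lend themselves to a direct check on small graphs. The sketch below is illustrative only: it computes vertex connectivity by brute force (exponential cost, so suitable only for toy examples), and the function names are assumptions rather than anything taken from the cited papers. It checks only the graph-theoretic parts of the conditions; the requirement of p + 3 or more anchors with no three collinear is geometric and is not tested.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Depth-first search connectivity test on an undirected graph."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for a, b in edges:
        if a in vertices and b in vertices:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def vertex_connectivity(n, edges):
    """Largest k for which the graph is k-connected (brute force)."""
    vertices = set(range(n))
    for k in range(n - 1):
        for removed in combinations(range(n), k):
            if not is_connected(vertices - set(removed), edges):
                return k
    return n - 1  # complete graph: K_n is (n-1)-connected by convention


def connectivity_checks(n, edges, p, q):
    """Return (necessary_ok, sufficient_ok) for loss of p sensors and q links."""
    kappa = vertex_connectivity(n, edges)
    # Necessary: (p+q+3)-connected and edges/vertices > (1/2)(q+3).
    density_ok = len(edges) / n > (q + 3) / 2
    necessary_ok = kappa >= p + q + 3 and density_ok
    # Sufficient (graph part only): (p+q+6)-connected.
    sufficient_ok = kappa >= p + q + 6
    return necessary_ok, sufficient_ok


# Example: the complete graph on 8 vertices is 7-connected.
k8_edges = list(combinations(range(8), 2))
```

For the complete graph on 8 vertices, the sufficient connectivity condition is met for p + q = 1 but not for p + q = 2, while the necessary conditions are met in both cases.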

The  effect  of  measurement  errors 

The  first  issue  we  sought  to  address  was  to  understand  how  errors  in  intersensor  distances 
translate  into  errors  in  localized  sensor  positions.  We  first  undertook  this  study  for  a  power 
law,  and  found  that,  roughly  speaking,  for  a  one-dimensional  network  (such  as  might  be  used 
along  power  lines,  or  a  road  for  example),  errors  in  position  estimates  would  grow  as  the 
square  of  the  minimal  hop  count  to  an  anchor.  Work  was  then  undertaken  on 
two-dimensional  sensor  networks  and  recently  a  conference  paper  on  that  topic  was 
published.  The  results  are  so  far  quite  limited,  but  new  techniques  have  been  developed.  We 
considered  the  scenario  where  a  single  sensor  was  localized  using  noisy  range  measurements 
from  a  number  of  anchors,  which  were  randomly  located.  We  found  that  even  with  a 
surprisingly  small  number  of  anchors,  the  error  in  the  sensor  position  estimate  was  close  to 
Gaussian;  there  is  a  sort  of  spatial  central  limit  theorem  applying,  which  says  that  if  you 
average  nongaussian  random  variables,  the  average  is  nearer  to  being  Gaussian.  [What 
remains  to  be  done,  and  will  be  hard,  is  the  extension  of  this  result  to  estimating  sensor 
positions  with  noisy  measurements  from  other  sensors  that  are  not  anchors.] 
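The single-sensor scenario can be explored with a small simulation. The sketch below is an assumption-laden illustration rather than the estimator analysed in the publications: it uses a standard linearized least-squares trilateration (subtracting the first range equation from the others), additive Gaussian range noise, and hypothetical function names. It produces position-error samples whose distribution could then be examined.

```python
import math
import random


def trilaterate(anchors, ranges):
    """Linearized least-squares 2D position estimate from ranges to anchors."""
    (x0, y0), d0 = anchors[0], ranges[0]
    # |x - a_i|^2 = d_i^2 minus the same equation for anchor 0 is linear in x.
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], ranges[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(xi**2 + yi**2 - x0**2 - y0**2 - di**2 + d0**2)
    # Solve the 2x2 normal equations (A^T A) x = A^T b.
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * r for (a, _), r in zip(rows, rhs))
    t2 = sum(b * r for (_, b), r in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear or too few")
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def position_errors(true_pos, anchors, sigma, trials=1000, seed=0):
    """Monte Carlo samples of the position error under Gaussian range noise."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        noisy = [math.dist(true_pos, a) + rng.gauss(0.0, sigma) for a in anchors]
        ex, ey = trilaterate(anchors, noisy)
        errors.append((ex - true_pos[0], ey - true_pos[1]))
    return errors


# Example geometry (hypothetical): four anchors at the corners of a square.
example_anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
example_target = (3.0, 4.0)
```

With noise-free ranges the estimate coincides with the true position up to rounding; with sigma > 0 the samples returned by position_errors can be histogrammed to compare against a Gaussian shape.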


Other  work  under  this  heading  recorded  in  the  publications  was  as  follows: 

•  For a network in which sensors are Poisson distributed with known density and
sensing radius, we found a method of estimating the intersensor distance between
any pair of sensors that could sense one another, based on determining for the two
sensors the number of other sensors that were within the sensing radius of both, and
the number of other sensors that were within the sensing radius of just one. This
procedure leads to surprisingly accurate localization using the estimated distances.

•  Localizing a sensor network may be computationally very expensive. We found a
procedure based on localizing subnetworks, such that if each subnetwork could be
separately localized, then it was easy to localize the whole network by gluing
together the individual pieces.

•  We obtained a general procedure for approximate computation of bias (which is
nothing more than a systematic error) in localization problems. (In work with
DSTO, the method was validated on trial data.)
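The neighbour-counting idea in the first item above can be illustrated as follows. This is a sketch under stated assumptions and may differ from the estimator in the publications: it matches the observed fraction of shared neighbours to the ratio of the lens area (intersection of the two sensing discs) to the area of their union, which holds in expectation for Poisson distributed sensors, and inverts that ratio by bisection. Because it uses a ratio, this variant does not need the density. The function names are hypothetical.

```python
import math


def lens_area(d, r):
    """Area of the intersection of two discs of radius r with centres d apart."""
    if d >= 2 * r:
        return 0.0
    return 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)


def estimate_distance(common, exclusive, r, tol=1e-9):
    """Estimate the separation of two mutually sensing sensors.

    common:    number of other sensors within the sensing radius of both
    exclusive: number of other sensors within the sensing radius of just one
    """
    total = common + exclusive
    if total == 0:
        raise ValueError("no neighbours observed")
    target = common / total  # observed fraction of neighbours that are shared
    lo, hi = 0.0, r  # the two sensors sense one another, so d <= r
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lens = lens_area(mid, r)
        union = 2 * math.pi * r * r - lens
        if lens / union > target:
            lo = mid  # predicted fraction too high: sensors are farther apart
        else:
            hi = mid
    return (lo + hi) / 2
```

The shared fraction decreases monotonically with separation, which is what makes the bisection valid; counts below the fraction expected at d = r simply return an estimate at the sensing radius.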


Finding  a  network  quality  measure 

We  consider  a  graphical  model  of  a  network  with  the  vertices  corresponding  to  network 
nodes  and  the  edges  of  the  graph  corresponding  to  the  network  links.  Nodes  fall  into  two 
classes:  those  that  are  internal  to  the  network,  and  those  that  are  source/destination  nodes  (if 
a  node  can  be  a  source  node,  it  can  also  be  a  destination  node,  and  vice  versa).  Associated 
with  each  edge  is  a  weight,  corresponding  to  the  probability  that  the  link  is  operative.  Link 
failures  are  independent  of  each  other.  While  there  are  many  matrices  that  can  be  associated 
with  a  graph,  including  a  weighted  graph,  that  reflect  aspects  of  connectivity,  e.g.  adjacency 
matrix,  Laplacian  matrix,  etc,  none  of  them  is  useful  here. 

Our  main  results  are  as  follows: 

•  We  have  defined  a  probabilistic  connectivity  matrix,  with  the  property  that  the  ij 
entry  measures  the  probability  that  there  is  a  path  from  node  i  to  node  j  (Here,  the 
nodes  in  question  must  be  source/destination  nodes) 

•  We  obtained  a  (nontrivial)  algorithm  to  compute  this  matrix 

•  The  matrix  is  symmetric,  nonnegative  definite  and  has  nonnegative  entries;  its 
largest  eigenvalue  (Perron  eigenvalue)  is  a  single  measure  of  the  network  quality 

•  Preliminary  results  suggest  experiments  can  be  conducted  at  the  source/destination 
nodes  to  determine  the  largest  eigenvalue  when  the  network’s  internal  structure  or 
link  probabilities  are  unknown 

•  The  matrix  yields  various  topological  properties  of  a  network: 

o  How  reliably  a  node  can  communicate  with  another  node 
o  Existence  of  any  single  point-of-failure  (critical  nodes) 
o  Other  eigenvalues  and  their  multiplicity  say  something  about  the  topology, 
e.g.  the  number  of  strongly  connected  components 
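For very small networks the probabilistic connectivity matrix can be obtained by exhaustive enumeration, which gives a useful reference when experimenting. The sketch below is not the (nontrivial) algorithm referred to above: it simply sums the probability of every link-state combination in which two nodes are joined, at a cost exponential in the number of links, and it computes entries for all nodes rather than only source/destination nodes. The largest eigenvalue is approximated by power iteration. All names are illustrative assumptions.

```python
from itertools import product


def find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def connectivity_matrix(n, links):
    """links: dict {(i, j): probability that undirected link i-j is operative}."""
    edge_list = list(links.items())
    matrix = [[0.0] * n for _ in range(n)]
    for states in product([False, True], repeat=len(edge_list)):
        weight = 1.0
        parent = list(range(n))
        for ((i, j), prob), up in zip(edge_list, states):
            weight *= prob if up else 1.0 - prob  # links fail independently
            if up:
                parent[find(parent, i)] = find(parent, j)
        for i in range(n):
            for j in range(n):
                if find(parent, i) == find(parent, j):
                    matrix[i][j] += weight
    return matrix


def perron_eigenvalue(matrix, iterations=1000):
    """Largest eigenvalue by power iteration from the all-ones vector."""
    n = len(matrix)
    vec = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        value = max(abs(x) for x in nxt)
        vec = [x / value for x in nxt]
    return value


# Example: a triangle in which every link is operative with probability 0.5.
triangle = {(0, 1): 0.5, (1, 2): 0.5, (0, 2): 0.5}
```

In the triangle example, nodes 0 and 1 are joined either directly or through node 2, so the corresponding entry is 0.5 + 0.5 * 0.25 = 0.625.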

Random  Network  Problems 

We considered networks where nodes are randomly placed in a given region. In
some of our studies, these nodes are considered to be stationary while in others,
these nodes are considered to be mobile. A link exists between a pair of nodes if
their Euclidean distance is smaller than a given threshold, known as the transmission
range. Under the above setting, we studied the connectivity property and the
information propagation process of these networks under a series of scenarios not yet
addressed in the literature.
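The basic model just described (stationary nodes, a link whenever two nodes are within the transmission range) is easy to simulate. The following Monte Carlo sketch estimates the probability that such a network is connected; it assumes nodes uniformly distributed in the unit square and at least one node, ignores interference and mobility, and uses hypothetical function names, so it illustrates the setting rather than any of the results listed below.

```python
import math
import random


def random_network_connected(n, tx_range, rng):
    """Place n >= 1 nodes uniformly in the unit square and test connectivity."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            # A link exists when the Euclidean distance is below the range.
            if j not in seen and math.dist(pts[i], pts[j]) < tx_range:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


def connectivity_probability(n, tx_range, trials=200, seed=0):
    """Fraction of random placements that yield a connected network."""
    rng = random.Random(seed)
    hits = sum(random_network_connected(n, tx_range, rng) for _ in range(trials))
    return hits / trials
```

Sweeping tx_range for a fixed n shows the sharp transition from almost never connected to almost always connected that motivates characterizing such networks by statistical parameters such as node density.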


Our  results  are  as  follows: 

•  We  have  found  the  necessary  and  sufficient  condition  for  a  network  with 
stationary  nodes  to  be  connected  where  the  link  between  a  pair  of  nodes  is 
subject  to  the  detrimental  impact  of  interference  caused  by  other 
simultaneously  active  transmissions. 

•  We derived an analytical expression for the capacity of the above network.

•  Considering  a  vehicular  network  with  vehicular  nodes  deployed  along  a  line 
and  with  regularly  placed  roadside  infrastructure,  we  found  the  probability  that 
a  randomly  chosen  node  can  access  the  infrastructure  either  directly  or  via 
other  nodes  as  relays,  and  the  probability  that  all  nodes  can  access  at  least 
one  infrastructure  node.  The  two  probabilities  are  important  measures  of 
network  performance  from  a  user  and  a  network  service  provider  perspective 
respectively. 

•  Considering  a  mobile  network  in  two-dimensional  space  where  each  node  is 
initially  randomly  placed  and  also  moves  randomly  in  a  confined  region,  we 
found  the  information  propagation  speed  of  the  network  and  the  percentage  of 
nodes  that  can  receive  the  information  of  a  randomly  chosen  node. 


Doppler  Localization  Problems  and  Miscellaneous  Multiagent  problems 

The  main  question  we  studied  was  how  many  spatially  separated  sensors  for  Doppler  shift 
would  be  required  to  localize  a  fast  moving  target.  The  main  conclusions  we  found  were  as 
follows: 


•  Four spatially separated sensors at generic positions and measuring Doppler shift
alone can localize the target (and determine its velocity) up to a finite number of
possibilities; disambiguation may be possible depending on a priori knowledge

•  With five spatially separated sensors at generic positions, there is no ambiguity, and
with six spatially separated sensors, arbitrary positions are satisfactory, except that
sensors cannot be collocated.

o  One aspect of this work showed for the first time that although five sensors
are generically sufficient for position and velocity estimation there are
non-trivial counterexamples where five sensors still lead to ambiguity in
both the position and velocity estimates. This result stems solely from
the nonlinear nature of the equations describing the Doppler shifts

•  With noisy measurements, there are good and bad geometries, and of course fewer
Doppler measurements will be enough if other measurements such as range or
bearing should be available.
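The nonlinearity mentioned above is visible in the measurement model itself. The sketch below is a hedged illustration, not the formulation used in the publications: it assumes a planar target that emits a carrier of known frequency, stationary sensors, one-way propagation and target speed small compared with the wave speed, so that the shift is proportional to the negative range rate. Under the planar assumption there are four unknowns (two position and two velocity coordinates) and each sensor contributes one equation, which is consistent with four sensors giving finitely many solutions. All names and parameters are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def doppler_shift(target_pos, target_vel, sensor_pos, carrier_hz,
                  wave_speed=SPEED_OF_LIGHT):
    """First-order Doppler shift (Hz) seen by a stationary sensor."""
    dx = target_pos[0] - sensor_pos[0]
    dy = target_pos[1] - sensor_pos[1]
    rng = math.hypot(dx, dy)
    if rng == 0.0:
        raise ValueError("target and sensor are collocated")
    # Range rate: projection of the velocity on the sensor-to-target direction.
    range_rate = (target_vel[0] * dx + target_vel[1] * dy) / rng
    return -carrier_hz * range_rate / wave_speed


def doppler_measurements(target_pos, target_vel, sensors, carrier_hz):
    """One shift per sensor; nonlinear in the unknown position and velocity."""
    return [doppler_shift(target_pos, target_vel, s, carrier_hz) for s in sensors]
```

A target heading straight at a sensor produces a positive shift, one moving perpendicular to the line of sight produces none, and position enters only through the unit direction vector, which is where the nonlinearity arises.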

Other miscellaneous work included interaction with Professor A S Morse of Yale University
on problems of consensus, which can have application to rendezvous problems, flocking
problems (where a group of agents has to acquire a common velocity) and estimation
problems in a large scale system where agents must share noisy estimates of some
underlying physical variable to arrive at a common estimate.

Triggered by our interest in retaining functionality of a sensor network in the event of sensor
or link failure, we also started to look at security problems in which malicious interference
with a network operation was a possibility.


Future  Work  Possibilities 


The work has opened up a number of fronts on which further research can be pursued. We
list some here:

•  Networked control and estimation underpin the core technology in numerous
critical infrastructure systems, e.g. electricity, transport and many defence
systems. However, the networked nature of modern control systems often makes
them vulnerable to deliberate attack; e.g. many industrial control systems may
even be accessible through the internet. As an outgrowth from our work on ensuring
redundancy  in  sensor  networks  so  that  they  retain  functionality  given  node  or  link 
failure,  one  can  seek  to  consider  design  methods  to  secure  redundancy  giving 
protection  against  attacks.  The  broad  question  is:  How  can  networked  control  and 
estimation  systems  be  protected  from  malicious  cyber  attackers?  An  emphasis  on 
robust  control,  estimation  and  robust  security  is  a  key  driver  of  this  avenue  of 
research  that  is  already  being  considered  within  NICTA  and  the  ANU.  The  outcome 
of  this  work  is  envisioned  to  be  a  suite  of  algorithms,  tools  and  design  considerations 
for  safeguarding  networked,  industrial,  control  and  estimation  systems  from  attack. 

•  As the examples cited above imply, many modern networked systems contain
a  mixture  of  control  systems  and  communication  systems.  Such  systems  also  are 
never  fixed;  devices  join,  links  are  made  and  broken  etc.  It  is  clear  that  the  design 
of  the  communication  system  part  of  these  complex  systems  should  not  be  separated 
from  the  design  of  the  control  system  part.  Indeed,  a  seamlessly  integrated  system 
can  lead  to  an  even  greater  benefit  and  in  many  cases  is  a  prerequisite  for  the  entire 
system  to  perform  satisfactorily.  Our  goal  would  be  to  develop  tools  for 
simultaneous  communication  network  design  and  control  system  design  to  cater  for 
the  intricate  demands  of  these  complex  systems  and  to  integrate  and  exploit  the 
characteristics  and  dynamics  of  complex  systems. 

Milestones 

The  following  milestones  were  agreed  for  the  second  year. 


Item: Submit two papers on formations and/or sensor networks retaining global rigidity and
localizability properties in event of loss of p nodes (and/or q communication links)
Status: Done

Item: Submit two papers on the connectivity of wireless sensor networks where sensors are
connected following some realistic channel model and connections are subject to
interference
Status: Done

Item: Submit paper on at least one other problem (e.g. distinguishing worst case and average
case of robustness, error and bias analysis, deterministic gossip problem, Doppler-based
estimation etc)
Status: Easily exceeded

Item: Arrange visit by a second US visitor and provide a report on expected downstream
outcome
Status: Done; Prof A S Morse visited in March 2012

Item: Attend at least one international conference in the US to disseminate outcomes and
networking
Status: Done. PI and three CIs all did this.

Item: Submit final report [both years] to AOARD
Status: This document is the final report
Multi-Agent Rigid Formations: A Study of Robustness to the Loss of Multiple Agents
Abstract: In this paper we study the robustness of information
architectures used to control a formation of autonomous agents. If agents
are expected to work in hazardous environments like battle-fields, the
formations are prone to multiple agent/link loss. Because agent loss is
more severe than link loss, the main contribution of this paper
is to propose information architectures for shape-controlled multi-agent
formations, which are robust against the loss of multiple agents. A
formation is said to be rigid if by actively maintaining a designated set of
inter-agent distances, the formation preserves its shape. We will use
rigidity theory to formalize the robust architecture problem. In particular
we study the properties of formation graphs which remain rigid after
the loss of any set of up to k - 1 vertices. Such a graph is called k-vertex
rigid. We provide a set of distinct necessary and sufficient conditions for
these graphs. We then show that 3-vertex rigidity is the highest possible
robustness one can achieve by just adding a small number of edges to
a minimally rigid graph, i.e. retention of rigidity given the loss of 3 or
more agents of a formation requires many more inter-agent distances to
be specified than when maintaining rigidity with no, one or two agent
losses. Based on this result, we further focus on 3-vertex rigid graphs and
characterize a class of information architectures (with minimum number
of control links) which are robust against the loss of up to two agents.

Index  Terms — Formation  Control,  Robustness,  Rigidity,  Redundant 
Rigidity 


I.  Introduction 

Recently, autonomous agents, and specifically UAVs (unmanned
aerial vehicles), have attracted significant interest among researchers.
UAVs have been an enabling technology in military applications
such as surveillance and reconnaissance for several decades [4],
[12]. Today, there is also increasing interest in UAVs for civil
applications such as environmental monitoring and exploration [1], [7].
In many applications it is desirable to have several autonomous UAVs
flying in a formation [7]. The formation control task is to control,
normally in a distributed manner, a set of inter-agent distances such
that a prescribed shape is achieved and the formation moves as a
cohesive whole [1], [5]. The reason is that when these agents, engaged
in surveillance or exploration missions, move in a formation with a
specific shape, they usually synthesize an antenna far
larger than that of a single agent [1], [14]. For certain distributed antenna
shapes, this can lead to better sensitivity in target detection or
localization over the area of interest.

Many of these missions for UAVs require a
high level of autonomy of the vehicles, specifically in hazardous
environments or in the presence of communication blackouts [9]. This
autonomy can provide higher precision, fault tolerance and reliability
in the mission accomplishment.

In hazardous environments such as battle-fields, the robustness of
the formation is of high importance [9], [4]. Robustness here means
ensuring a successful accomplishment of the mission in different
operational scenarios, when both some agents and communication
links may be lost due to mechanical failure, enemy attack, jamming,
an agent deliberately leaving the formation, etc.

In  [4],  as  cited  in  [17],  some  vulnerability  issues  of  UAVs  are 
studied  based  on  the  actual  records  of  the  past  and  the  potential 
issues  are  mainly  classified  as 

•  communication  link  loss  as  a  result  of  jamming  or  occlusion 

•  enemy  attack  of  one  or  more  UAVs  in  the  formation 

•  loss of an agent or communication link as a result of a mechanical
or electrical failure, without enemy attack but possibly due
to environmental changes (heat, wind, etc.).

Based on the above study the authors concluded that it is necessary to
reduce the vulnerability of UAVs to such threats.

It  is  evident  that  the  robustness  of  the  formation  to  an  agent  loss 
demands  more  than  robustness  to  a  single  link  loss,  as  the  loss  of  one 
agent  implies  the  loss  of  all  control  links  incident  to  it.  Therefore, 
we  focus  on  the  multiple  agent  loss  problem  throughout  this  paper. 
The issue can be dealt with in many different ways [15]. However,
in this paper we adopt a proactive approach. We introduce
the robustness into the information architecture a priori, by using
redundant links, in order to mitigate the effect of multiple agent loss.
The measure of robustness here is the number of agents the formation
can afford to lose while preserving its cohesiveness.

We  assume  a  graphical  abstraction  of  the  multi-agent  formation,  via 
an undirected graph G = (V, E), in which each vertex corresponds to
an  agent  and  an  edge  corresponds  to  a  bidirectional  distance  control 
law,  actively  maintaining  the  distance  between  the  corresponding 
agents.  This  enables  us  to  study  the  robustness  of  these  formations 
via  rigidity  of  the  formation  graph  (an  area  of  graph  theory  that 
deals  with  the  characterization  of  graphs  corresponding  to  formations 
which  are  rigid).  Generally  speaking,  a  formation,  and  by  extension 
its  underlying  graph,  is  termed  rigid  if  the  distance  between  each 
pair  of  agents  remains  constant  over  time,  normally  through  a  subset 
of  the  inter-agent  distances  being  actively  maintained  at  prescribed 
values.  It  has  been  shown  that  this  property  is  generic,  in  the  sense 
that  the  rigidity  of  almost  all  formations  with  the  same  graph  depends 
only  on  the  topology  of  its  graph  and  not  the  actual  distances  between 
the  agents.  This  implies  that  having  enough  well-distributed  control 
links  within  the  formation  will  lead  to  a  rigid  formation  no  matter 
what  the  actual  distances  are  between  agents. 

The rigidity of graphs has been extensively studied in the past
[3], especially for graphs corresponding to formations in an ambient
two-dimensional space. However, robustness to multiple agent/link loss
is a very new topic. In [17], robustness is defined by introducing
the notion of k-edge/k-vertex rigidity of a graph: the graph remains
rigid after the loss of up to $k-1$ edges/vertices; [17] also derived a
collection of general properties of such graphs. In [18], a
characterization is derived for k-edge rigidity. The author in [13] introduced
birigid graphs, which are graphs remaining rigid after the loss of one
vertex. In a different direction, [15] proposes a procedure to restore
the rigidity property of a rigid formation when one of its agents is
lost.

The main contribution of this paper is to parallel the path posed
in [17], [18] and [15], focusing on the loss of more than one agent.
In a related study [10], we have derived some distinct necessary and
sufficient conditions for graphs which are k-vertex globally rigid, a
stronger condition than rigidity. Similar results for rigidity are
provided here as a partial characterization of k-vertex rigid graphs,
i.e. graphs which remain rigid after the loss of up to $k-1$ vertices.
As it turns out, for $k \le 3$ a size independence property holds, i.e. the
cost, in terms of extra edges required to secure the k-vertex rigidity
property as opposed to mere rigidity, is very small and independent of
the number of agents in the formation. As is later explained, this low
cost property is lost when robustness against the loss of more than
2 agents is sought. This observation reinforces our focus on 3-vertex
rigid graphs with the minimum edge count. A partial characterization
of these graphs will be derived, followed by a constructive approach
through which one can obtain a 3-vertex rigid graph on an arbitrary
number of vertices. This characterization enables the formation designer to
embed the robustness into a formation prior to any actual mission.

The  structure  of  the  paper  is  as  follows.  In  Section  II,  some  basic 
definitions  are  provided.  Section  III  contains  the  main  contribution 
of  the  paper  and  is  followed  by  the  concluding  remarks  in  Section 
IV. 

II.  Background 

Rigidity  theory  has  been  well  studied  in  the  literature  [3],  [16]. 
In  this  section,  we  introduce  some  basic  definitions  and  properties 
required  for  the  main  contribution.  The  interested  reader  may  refer 
to  [3]  for  further  details.  Throughout  this  paper  we  assume  that  the 
formation  lies  in  an  ambient  two-dimensional  space  and  is  modeled 
by  a  graph  whose  rigidity  implies  the  rigidity  of  the  formation  and 
all generic formations with the same graph [3].

A.  Rigidity  and  Minimal  Rigidity 

As mentioned earlier, a graph is called rigid if the only smooth
motions are those corresponding to translation and rotation of the whole
formation (see [3] for a precise definition). According to [16], a graph
is minimally rigid if it is rigid and removing any one of the edges
results in a nonrigid graph (for which any corresponding formation
would have the possibility of flexing motions, apart from translation
and rotation). There is a celebrated theorem, called Laman's Theorem
[6], which gives a fully combinatorial characterization of minimally
rigid graphs. In [6], it is also proved that every rigid graph contains
a minimally rigid subgraph with the same vertices.
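As an illustration of the combinatorial characterization mentioned above, the following is a minimal sketch of the counting condition usually stated as Laman's Theorem: a graph on $n \ge 2$ vertices is minimally rigid in the plane if and only if it has exactly $2n - 3$ edges and every subset of $n' \ge 2$ vertices spans at most $2n' - 3$ edges. The function name `is_laman` and the vertex labelling 0..n-1 are assumptions made for this example, and the subset enumeration is exponential, so it is only meant for small graphs.

```python
from itertools import combinations


def is_laman(n, edges):
    """Brute-force check of the Laman counts on vertices 0..n-1.

    Illustrative only: enumerates every vertex subset, so the running
    time grows exponentially with n.
    """
    if n < 2:
        raise ValueError("the counting condition is stated for n >= 2")
    edge_set = {frozenset(e) for e in edges}
    # Global count: exactly 2n - 3 edges.
    if len(edge_set) != 2 * n - 3:
        return False
    # Local count: no subset of size s spans more than 2s - 3 edges.
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            spanned = sum(1 for e in edge_set if e <= chosen)
            if spanned > 2 * size - 3:
                return False
    return True


triangle = [(0, 1), (1, 2), (0, 2)]
k4_minus_edge = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)]
print(is_laman(3, triangle), is_laman(4, k4_minus_edge))
```

A graph can have the right total edge count and still fail the test when its edges are badly distributed, for instance when a $K_4$ is hidden inside a graph with $2n - 3$ edges.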

B.  Redundant  Rigidity 

A graph is said to be redundantly rigid if it is rigid and, after
removing any one of the edges, it still remains rigid. This notion can
be generalized to the loss of k edges and/or k vertices. A graph
will be termed k-edge rigid if after deletion of any set of up to
$k-1$ edges, a rigid graph always results. With the same notion a
graph is k-vertex rigid if after deletion of any set of up to $k-1$
vertices, the resulting graph is still rigid. As mentioned before, we
are interested in addressing multiple agent loss. Therefore, in the
remainder of the paper we study k-vertex rigidity of the underlying
graphs. The following theorem and lemma were first proved in [17]
and due to their relevance are restated here (we provide the proof of
Lemma 2 as it summarizes several observations in [17]):


Theorem 1. If $G = (V, E)$ is a k-vertex rigid graph, then each
vertex has a degree of at least $k + 1$.

Lemma 2. In a k-vertex rigid graph $G = (V, E)$, $|V| \ge k + 3$,
except for $K_{k+2}$, the complete graph on $k + 2$ vertices.

Proof: Suppose, to obtain a contradiction, that $|V| \le k + 2$ and
G is not $K_{k+2}$. The k-vertex rigidity of G implies $\delta_G(v) \ge k + 1$
(degree of vertex $v \in V$), which implies that $|V|$ cannot be less
than $k + 2$. However, the only graph with $|V| = k + 2$ in which
$\delta_G(v) \ge k + 1$ holds is $K_{k+2}$. This contradiction implies $|V| \ge k + 3$.

■ 

A graph is called minimally k-vertex (k-edge) rigid if it is k-vertex
(k-edge) rigid but after removing any one of the edges the resulting
graph is no longer k-vertex (k-edge) rigid.
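The definitions above can be checked directly on small graphs. The sketch below is not the method of the paper or of [17]: it assumes a standard randomized test for generic rigidity in the plane (the rigidity matrix at random positions has rank $2n - 3$), evaluated modulo a large prime to avoid floating point, and then removes every set of up to $k - 1$ vertices by brute force. All function names are hypothetical. A `True` answer from `is_rigid_2d` is always correct; a `False` answer can be wrong only with negligible probability.

```python
import random
from itertools import combinations

P = (1 << 61) - 1  # prime modulus used for exact rank computations


def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over the field of integers modulo p."""
    rows = [r[:] for r in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def is_rigid_2d(vertices, edges, seed=0):
    """Randomized test of generic rigidity in the plane."""
    vs = sorted(vertices)
    n = len(vs)
    if n <= 1:
        return True
    rng = random.Random(seed)
    pos = {v: (rng.randrange(P), rng.randrange(P)) for v in vs}
    idx = {v: i for i, v in enumerate(vs)}
    rows = []
    for u, v in edges:
        dx = (pos[u][0] - pos[v][0]) % P
        dy = (pos[u][1] - pos[v][1]) % P
        row = [0] * (2 * n)
        row[2 * idx[u]], row[2 * idx[u] + 1] = dx, dy
        row[2 * idx[v]], row[2 * idx[v] + 1] = -dx % P, -dy % P
        rows.append(row)
    return rank_mod_p(rows) == 2 * n - 3


def is_k_vertex_rigid(vertices, edges, k):
    """True if the graph stays rigid after deleting any up to k-1 vertices."""
    vertices = set(vertices)
    for r in range(k):
        for removed in combinations(sorted(vertices), r):
            keep = vertices - set(removed)
            sub = [(u, v) for u, v in edges if u in keep and v in keep]
            if not is_rigid_2d(keep, sub):
                return False
    return True


# K6 with one edge removed: 6 vertices and 14 edges.
k6_minus_edge = [e for e in combinations(range(6), 2) if e != (0, 1)]
print(is_k_vertex_rigid(range(6), k6_minus_edge, 3))
```

The brute-force loop examines on the order of $|V|^{k-1}$ vertex subsets, so it is useful for experimenting with small formations rather than as a practical verification procedure.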

In  the  case  of  mere  rigidity,  the  definition  of  minimal  rigidity  is 
proved  to  be  equivalent  to  an  alternative  statement:  a  rigid  graph 
is  called  minimally  rigid  if  it  has  the  minimum  number  of  possible 
edges  among  all  rigid  graphs  with  the  same  number  of  vertices. 
Unfortunately, for general redundant rigidity (k-edge rigidity or
k-vertex rigidity), these two notions are no longer equivalent; there are
some graphs which are minimally redundantly rigid but whose number of
edges is not the minimum possible among such graphs with the
same vertex count. This property of redundant rigidity leads us to two
different notions: strongly minimal and weakly minimal redundant
rigidity [15].

•  A k-vertex (k-edge) rigid graph is said to be strongly minimal if
it has the minimum possible number of edges on a given number
of vertices.

•  A k-vertex (k-edge) rigid graph is said to be weakly minimal
if it has more than the minimum possible number of edges on
a given number of vertices, but has the property that removing
any edge destroys k-vertex (k-edge) rigidity.

Later, we will need the characterizing conditions for 2-vertex rigidity
as a special case. The remainder of this section contains the results
that characterize strongly minimal 2-vertex rigid graphs [13]. Figure 1,
originally from [13], depicts examples of strongly minimal 2-vertex
rigid graphs.


Figure 1. Examples of the 2 possible partitions of the edge set for strongly
minimal 2-vertex rigid graphs: (a) the degree 3 vertices are adjacent, (b) the
degree 3 vertices are non-adjacent.

Lemma 3. If $G = (V, E)$ is a 2-vertex rigid graph on 5 or more
vertices, then $|E| \ge 2|V| - 1$.

Theorem 4. Let $G = (V, E)$ be a strongly minimal 2-vertex rigid
graph on 5 or more vertices. Then G has exactly two vertices with
degree 3 and the remaining vertices have degree 4, which implies
$|E| = 2|V| - 1$.

Theorem 5. A graph $G = (V, E)$ is strongly minimal 2-vertex rigid
if and only if G has exactly two vertices of degree 3 and there is a
partition of the edge set E

$E = E_1 \cup E_2 \cup \dots \cup E_k$

such that the graph induced by $E \setminus E_i$ is minimally redundantly
rigid (i.e. the removal of any edge destroys redundant rigidity) for
all i, and either

•  $E_1$ and $E_2$ are the edges incident to the two non-adjacent
vertices of degree 3, respectively, and $E_i$ is a single edge for
$3 \le i \le k$, or

•  $E_1$ is the union of the edges incident to the two adjacent vertices
of degree 3, and $E_i$ is a single edge for $2 \le i \le k$.

III.  Results 

In this section we start by studying a sufficient condition for
k-vertex rigidity (Section III-A). This will enable us to propose
formation structures which are robust against the loss of up to $k-1$
agents. Then, from a different perspective, the structure of k-vertex
rigid graphs is studied and a distinct necessary condition is obtained
(Section III-B). We will show the size independence property for
$k \le 3$ and display the optimality of structures which are 3-vertex
rigid (highest value of k with size independence property). Therefore,
we will give some characterization of the special case of 3-vertex rigidity.
This is done by introducing some results on the relation between
vertex and edge counts and vertex degrees of these graphs. Starting
from $|V| = 6$, we will propose a constructive approach which can
grow such graphs to arbitrary size via an extension operation.

A.  Sufficient  Condition  for  k-Vertex  Rigidity 

The notion of k-connectivity has been well studied in the literature
and there are efficient algorithms to check this property on a given
graph [11]. The idea here is to find a connectivity threshold, of the
form k plus a constant, which is sufficient for k-vertex rigidity. The
motivation comes from [8], where it is shown that in 2D, 6-connectivity
implies rigidity and 6 is the least possible number for this condition
(k-connectivity with $k < 6$ is not sufficient for rigidity). The
following theorem is derived from a stronger theorem we first proposed in [10].

We  can  extend  this  result  to  the  case  of  fc-vertex  rigidity,  as  the 
following  theorem  shows: 

Theorem 6. Assume that $G = (V, E)$ is a $(k + 5)$-connected graph.
Then G is k-vertex rigid.

The importance of this result is that it shows that the more
complicated and awkward-to-verify property of k-vertex rigidity can
be reduced to the well-known and easier to study property of
k-connectivity.
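To make the use of Theorem 6 concrete, here is a small sketch that tests the sufficient condition by checking $(k + 5)$-connectivity. It is deliberately a brute-force check (delete every set of fewer than $k + 5$ vertices and test connectedness), not one of the efficient algorithms referred to in [11]; the function names are assumptions for this example. Because the condition is only sufficient, a `False` result says nothing about whether the graph is k-vertex rigid.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Depth-first search connectivity test on the induced subgraph."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices


def is_k_connected(vertices, edges, k):
    """True if there are more than k vertices and no set of fewer than k
    vertices disconnects the graph (brute force, small graphs only)."""
    vertices = set(vertices)
    if len(vertices) <= k:
        return False
    for r in range(k):
        for removed in combinations(sorted(vertices), r):
            if not is_connected(vertices - set(removed), edges):
                return False
    return True


def passes_theorem_6(vertices, edges, k):
    """Sufficient (not necessary) test for k-vertex rigidity."""
    return is_k_connected(vertices, edges, k + 5)


k8 = list(combinations(range(8), 2))  # complete graph on 8 vertices
print(passes_theorem_6(range(8), k8, 2))
```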

B.  Necessary  Condition  for  k-Vertex  Rigidity 

In this subsection we derive a necessary condition similar to what
we have derived in [10] but for k-vertex rigidity. This result gives a
lower bound on the number of edges of such graphs. From [6], [13]
we know that the minimum required numbers of edges for a graph to
be rigid and 2-vertex rigid are $|E| = 2|V| - 3$ and $|E| = 2|V| - 1$,
respectively. This observation leads us to an important question: since
in both the above lower bounds the number of edges is twice the
number of vertices ($|V|$) plus a constant, can we conjecture that for
any $k > 2$, the edge count of any strongly minimal k-vertex rigid
graph satisfies the condition $|E| = 2|V| + c(k)$ (c is independent of
$|V|$)? Unfortunately, the answer is no and, as we show in Theorem
8, this property is only valid for $k \le 3$. For such values of k we say
that the k-vertex rigid graph has the size independence property,
meaning that the number of edges required to obtain a k-vertex rigid
graph from a rigid graph is independent of the number of vertices
($|V|$).

Theorem 7 gives a lower bound on the edge count of a k-vertex
rigid graph. We should mention that the same result is valid for
k-edge rigid graphs as well.


Theorem 7. In a strongly minimal k-vertex rigid graph, with $k > 2$,
the edge count is under-bounded by the formula

$|E| \ge \left\lceil \frac{k+1}{2} |V| \right\rceil + c(k)$,

where $c(k)$ is an integer (c is independent of $|V|$ but depends on
k) and for $k > 3$, if the equality holds (i.e.
$|E| = \left\lceil \frac{k+1}{2} |V| \right\rceil + c(k)$), then $c(k) \ge 0$.

Proof: Assume that $G = (V, E)$ is a strongly minimal k-vertex
rigid graph on $|V| \ge k + 3$ vertices, whose number of edges is
$|E| = a|V| + c(k)$ (according to Theorem 11 in [10] such a graph
exists), where $c(k)$ is independent of $|V|$. According to Theorem 1,
$\delta_i \ge k + 1$ holds for every vertex i. Therefore, the average degree
in G is $\delta_{avg} \ge k + 1$. On the other hand,
$\delta_{avg} = \frac{2|E|}{|V|} = 2a + \frac{2c(k)}{|V|}$. Hence,

$2a + \frac{2c(k)}{|V|} \ge k + 1 \;\Rightarrow\; k \le (2a - 1) + \frac{2c(k)}{|V|}$.

Since the property must hold for graphs of arbitrary size and in
particular arbitrarily large $|V|$, assuming $|V| > 2c(k)$, we will have
$k \le 2a - 1$ or $a \ge \frac{k+1}{2}$. Therefore,
$|E| \ge \left\lceil \frac{k+1}{2} |V| \right\rceil + c(k)$ holds
for $|V| > 2c(k)$.

Now suppose that for some $|V|$ the equality holds (i.e.
$|E| = \left\lceil \frac{k+1}{2} |V| \right\rceil + c(k)$). We prove that
for such a strongly minimal k-vertex rigid graph $c(k) \ge 0$ always holds.

First suppose k is odd. Then $\frac{k+1}{2}|V|$ is an integer and we will
have $\delta_{avg} = k + 1 + \frac{2c(k)}{|V|} \ge k + 1$, which implies
$c(k) \ge 0$.

If k is even, then $|E| = \frac{k}{2}|V| + \left\lceil \frac{|V|}{2} \right\rceil + c(k)$,
which gives $|E| \le \frac{k}{2}|V| + c(k) + \frac{|V|}{2} + \frac{1}{2}$.
Therefore, $\delta_{avg} \le k + \frac{2c(k)}{|V|} + 1 + \frac{1}{|V|}$ holds.
This implies $k + 1 \le k + 1 + \frac{2c(k)}{|V|} + \frac{1}{|V|}$, which gives
$-1 < c(k)$, and so, c(k) being an integer, $c(k) \ge 0$ holds. ■

Theorem 8. The highest value of k for which the k-vertex rigid graph
$G_k = (V, E)$ can have the size independence property is $k = 3$.

Proof: As is shown in the proof of Theorem 7, the inequality
$k \le 2a - 1$ holds. On the other hand, if $G_k$ has the size independence
property, then we necessarily must have $a = 2$ (as for rigidity
and 2-vertex rigidity $a = 2$). This implies $k \le 3$. Therefore
$\arg\max_k (G_k) = 3$ holds for size independent graphs $G_k$ and the
proof is complete. ■

From this result we can conclude that a formation designer can
embed the robustness to the loss of up to two agents into a formation
of any agent count, by adding only a fixed set of control links with
possible redistribution of some edges to be incident on different vertex
pairs. Indeed, by studying 3-vertex rigid graphs in the next section,
we will derive the exact number of required control links to obtain
such a robustness level (Lemma 9).

The  size  independence  property  is  not  true  when  there  is  robustness 
to  the  loss  of  more  than  two  agents.  Table  I  shows  an  example  of 
a  formation  with  100  agents  and  the  required  number  of  control 
links  for  any  desired  level  of  robustness.  As  is  evident,  there  is  a 
considerable  increase  in  the  required  number  of  control  links  when 
we  want  to  obtain  robustness  to  the  loss  of  more  than  two  agents. 


Table I
The required number of control links in a 100-agent formation
with different robustness properties. c+ is a positive integer.

# of agent losses tolerated    # of required links
0                              197
1                              199
2                              202
3                              250 + c+
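The entries of Table I follow from the edge counts quoted in this section: $2|V| - 3$ for rigidity, $2|V| - 1$ for 2-vertex rigidity, $2|V| + 2$ for 3-vertex rigidity, and, for larger k, the degree bound of Theorem 1, which forces at least $\lceil (k+1)|V|/2 \rceil$ edges. The helper below is a sketch that reproduces those numbers; the function name and the returned `exact` flag are assumptions made for illustration, and for $k \ge 4$ only the lower bound is returned because the additive constant is not given in the text.

```python
def required_links(n_agents, losses_tolerated):
    """Return (link_count, exact) for a formation of n_agents agents.

    losses_tolerated = k - 1 for a k-vertex rigid formation graph.
    exact is True when the count is the known minimum and False when
    it is only a lower bound obtained from the minimum-degree condition.
    """
    k = losses_tolerated + 1
    if k == 1:
        return 2 * n_agents - 3, True   # minimally rigid graph
    if k == 2:
        return 2 * n_agents - 1, True   # valid for 5 or more agents
    if k == 3:
        return 2 * n_agents + 2, True   # valid for 6 or more agents
    # Every vertex needs degree >= k + 1, so |E| >= ceil((k + 1) n / 2).
    return ((k + 1) * n_agents + 1) // 2, False


for losses in range(4):
    print(losses, required_links(100, losses))
```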


C.  Relations  between  edge  count,  vertex  count  and  vertex  degrees 

In this subsection, we study 3-vertex rigidity by presenting some
results linking the vertex and edge counts and vertex degrees. The
first result underbounds the edge count in a 3-vertex rigid graph in
terms of the vertex count.

Lemma 9. If $G = (V, E)$ is a 3-vertex rigid graph with $|V| \ge 6$,
then $|E| \ge 2|V| + 2$.

Proof: First we prove that the inequality $|E| \ge 2|V| + 1$ holds.
It is then enough to show that there is no 3-vertex rigid graph with
$|E| = 2|V| + 1$ edges.

For the inequality, to obtain a contradiction, suppose G is a
3-vertex rigid graph and has $|E| \le 2|V|$ edges. In this case the average
vertex degree is at most 4. Since all vertex degrees must be at least
4, this implies that the degree of each vertex is precisely 4, so that
$|E| = 2|V|$.

Now consider $v_1, v_2 \in V$ of degree 4 which are not adjacent to
each other (and always exist since $|V| \ge 6$), and remove them from
G to produce a graph $G' = (V', E')$. Then $|E'| = 2|V| - 8 =
2|V'| - 4$. Obviously $G'$ is not rigid, which contradicts the fact that
G is 3-vertex rigid.

To prove that $|E| = 2|V| + 1$ is impossible, suppose that the
equality holds and set $n = |V|$. The average vertex degree in G is
$\frac{2|E|}{|V|} = \frac{4n+2}{n} = 4 + \frac{2}{n} > 4$. Therefore, there is
at least one vertex with degree of more than 4. Now suppose $v_1, v_2 \in V$
with degrees $k_1$ and $k_2$, respectively. Observe that $G - v_1 - v_2$,
obtained by removing $v_1$, $v_2$ and all their incident edges from G, has
$n - 2$ vertices and at most
$|E| = 2n + 1 - (k_1 + k_2 - 1) = 2(n - 2) - (k_1 + k_2 - 6)$
edges (the bound being achieved when $v_1$, $v_2$ are neighbors). Since
$G - v_1 - v_2$ is rigid, $|E| \ge 2(n - 2) - 3$ must hold. So
$k_1 + k_2 - 6 \le 3$ or $k_1 + k_2 \le 9$. By considering $k_1 \ge 4$,
$k_2 \ge 4$ we conclude that $4 \le k_1 \le 5$ and $4 \le k_2 \le 5$.
Finally, if there are m vertices of degree 5, we have
$5m + 4(n - m) = 2(2n + 1)$, which gives $m = 2$.
This means that such a graph (if it exists) should have exactly two
vertices of degree 5 which are adjacent and the others with degree 4.

Now we prove that such a graph cannot be 3-vertex rigid. First
consider the case that $n = 6$. In this case all vertices of degree 4
are connected to vertices of degree 5 (Figure 2). It is obvious that by
removing the vertices with degree 5, the resulting graph is not rigid. Now
consider $n \ge 7$. In this case there is at least one pair $(v_1, v_2)$ with
degrees 4, 5 which are not adjacent. $G - v_1 - v_2$ has
$2n + 1 - 5 - 4 = 2n - 8 = 2(n - 2) - 4$ edges, which contradicts the fact
that it is rigid. This completes the proof. ■


Figure  2.  A  graph  with  two  5-degree  and  four  4-degree  vertices. 

Figure 3 shows an example of a strongly minimal 3-vertex rigid
graph with $|E| = 2|V| + 2$ on 6 vertices. Later in Section III-D, we
will show that this class of graphs is an infinite set, by introducing an
extension operation which can grow a strongly minimal 3-vertex rigid
graph by one vertex. In the following theorem, and in anticipation of
the demonstration of existence of such graphs for arbitrary $|V| \ge 6$,
we present a condition on the vertex degrees of these graphs.

Theorem 10. There exists a strongly minimal 3-vertex rigid graph
$G = (V, E)$ with $|E| = 2|V| + 2$ for any $|V| \ge 6$. In G, there are
exactly 4 vertices with degree of 5 which are all adjacent (forming
a $K_4$ subgraph) and all other vertices have degree of 4.

Proof: This graph has an average degree of $4 + \frac{4}{n}$ (taking
$|V| = n$). Therefore, there is at least one vertex with degree of more than
4. Now suppose $v_1, v_2 \in V$ with degrees $k_1$, $k_2$, respectively. Then,
$G - v_1 - v_2$ has $n - 2$ vertices and at most
$|E(G - v_1 - v_2)| = 2n + 2 - (k_1 + k_2 - 1) = 2(n - 2) - (k_1 + k_2 - 7)$
edges (when $v_1$, $v_2$ are neighbors). Since $G - v_1 - v_2$ is rigid,
$|E(G - v_1 - v_2)| \ge 2(n - 2) - 3$ holds. Hence, $k_1 + k_2 - 7 \le 3$ or
$k_1 + k_2 \le 10$. By considering $k_1 \ge 4$, $k_2 \ge 4$ we conclude that
$4 \le k_1 \le 6$ and $4 \le k_2 \le 6$.

Finally, if there are m vertices of degree 6 and t vertices of degree
5, we have $6m + 5t + 4(n - m - t) = 2(2n + 2)$, which gives
$2m + t = 4$. There are 3 possible cases:

a. $m = 1$, $t = 2$: in this case if we remove the 6-degree vertex
in addition to a 5-degree one, the number of edges will be at most
$|E(G - v_1 - v_2)| = 2n + 2 - (6 + 5 - 1) = 2n - 8 = 2(n - 2) - 4$,
which contradicts the fact that $G - v_1 - v_2$ is rigid and
$|E(G - v_1 - v_2)| \ge 2(n - 2) - 3$. Therefore, this case cannot occur.

b. $m = 2$, $t = 0$: with the same argument as in case a, by
removing two vertices with degree of 6 we have at most
$|E(G - v_1 - v_2)| = 2n + 2 - (6 + 6 - 1) = 2n - 9 = 2(n - 2) - 5$
edges and it is obvious that the resulting graph is not rigid. Hence
again, this case cannot occur.

c. $m = 0$, $t = 4$: The proof of this case is trivial. The only
important condition is that all 5-degree vertices should be adjacent.
Otherwise, removing any 2 non-adjacent ones results in a non-rigid graph
($|E(G - v_1 - v_2)| = 2(n - 2) - 4 < 2(n - 2) - 3$). Figure 3 shows
such a graph with 6 vertices. ■
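The case analysis in the proof of Theorem 10 is pure edge counting, so it can be replayed numerically. The sketch below (function name hypothetical) enumerates the solutions of $2m + t = 4$, removes the two highest-degree vertices under the most favourable assumption that they are adjacent, and compares the number of surviving edges with the $2(n - 2) - 3$ edges that a rigid graph on $n - 2$ vertices needs. It checks only this necessary count, not rigidity itself.

```python
def degree_cases(n):
    """Map each (m, t) with 2m + t = 4 to True if the edge count after
    deleting the two highest-degree vertices can still reach 2(n-2) - 3.

    m = number of degree-6 vertices, t = number of degree-5 vertices,
    in a graph with n vertices and 2n + 2 edges.
    """
    total_edges = 2 * n + 2
    needed = 2 * (n - 2) - 3
    outcome = {}
    for m in range(3):
        t = 4 - 2 * m
        top_two = sorted([6] * m + [5] * t, reverse=True)[:2]
        # Adjacent vertices share one edge, so k1 + k2 - 1 edges vanish;
        # this is the largest possible number of surviving edges.
        surviving = total_edges - (sum(top_two) - 1)
        outcome[(m, t)] = surviving >= needed
    return outcome


print(degree_cases(10))
```

Only the case $m = 0$, $t = 4$ survives the count, and only when the removed degree-5 vertices are adjacent, which is why the four degree-5 vertices must form a $K_4$.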


Figure  3.  Strongly  Minimal  3-vertex  rigid  graph  with  |Vj  =  6. 

Remark 11. Observe that the condition provided by Theorem 10 is not
sufficient to ensure strongly minimal 3-vertex rigidity. As a
counterexample, consider the graph G shown in Figure 4 (left side). It is easy
to observe that this graph satisfies the degree condition of Theorem 10.
However, as shown in the right side of the figure, removing the vertices
shown by unfilled circles results in a non-rigid graph. Therefore, G is
not 3-vertex rigid.


Figure 4. A graph which satisfies the condition of Theorem 10 but is not
3-vertex rigid.


D.  Growing  strongly  minimal  3-vertex  rigid  graphs 

In this section we exhibit an infinite class of strongly minimal
3-vertex rigid graphs. The approach is to propose an extension operation
which can be applied to any strongly minimal 3-vertex rigid graph and
increases its vertex count by one. This operation is a special case of the
X-replacement operation [15], used before for extending weakly minimal
2-vertex rigid graphs; we call this new one 4-5 X-Replacement, for
reasons which are about to become apparent. Suppose that the original
graph is $G = (V, E)$. Choose two edges $e_1 = \{a, b\}$ and $e_2 = \{c, d\}$,
with $e_1, e_2 \in E$, so that a, c have degree 4 and are non-adjacent and
b, d have degree 5 (that such a choice is in fact possible is proved
below). Remove $e_1$, $e_2$ and add a new vertex called z. Connect z to
a, b, c, d. In the 4-5 X-Replacement operation the degree of the original
vertices remains the same and a new vertex of degree 4 is added
to G. Therefore, the new graph satisfies the conditions of the above
theorems.
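The operation just described is easy to state as code. The following sketch represents a graph as a set of two-element `frozenset` edges, validates the degree and adjacency requirements, and returns the extended edge set; the function names and the edge representation are assumptions made for this example, and the sketch performs the graph surgery only, without verifying rigidity. It is applied to the 6-vertex graph obtained from $K_6$ by deleting one edge, which has four mutually adjacent degree-5 vertices and two non-adjacent degree-4 vertices.

```python
from itertools import combinations


def degrees(edges):
    """Vertex degrees of a graph given as an iterable of 2-element edges."""
    deg = {}
    for e in edges:
        for v in e:
            deg[v] = deg.get(v, 0) + 1
    return deg


def x_replacement_4_5(edges, e1, e2, z):
    """Apply the 4-5 X-replacement: delete e1 = {a, b} and e2 = {c, d},
    then join the new vertex z to a, b, c and d."""
    edges = {frozenset(e) for e in edges}
    e1, e2 = frozenset(e1), frozenset(e2)
    if e1 not in edges or e2 not in edges:
        raise ValueError("both chosen edges must belong to the graph")
    deg = degrees(edges)
    if z in deg:
        raise ValueError("z must be a new vertex")
    # Within each edge, the lower-degree endpoint plays the role of a (or c).
    a, b = sorted(e1, key=lambda v: deg[v])
    c, d = sorted(e2, key=lambda v: deg[v])
    if (deg[a], deg[b], deg[c], deg[d]) != (4, 5, 4, 5):
        raise ValueError("each edge must join a degree-4 and a degree-5 vertex")
    if b == d or frozenset((a, c)) in edges:
        raise ValueError("b, d must differ and a, c must be non-adjacent")
    new_edges = (edges - {e1, e2}) | {frozenset((z, v)) for v in (a, b, c, d)}
    return new_edges


# K6 minus the edge {0, 1}: vertices 0 and 1 have degree 4, the rest degree 5.
g6 = {frozenset(e) for e in combinations(range(6), 2)} - {frozenset((0, 1))}
g7 = x_replacement_4_5(g6, (0, 2), (1, 3), z=6)
print(len(g6), len(g7), sorted(degrees(g7).values()))
```

Each application removes two edges and adds four, so the edge count grows by two per added vertex and the relation $|E| = 2|V| + 2$ is preserved.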

It  remains  to  prove  that  this  operation  preserves  strongly  minimal 
3-vertex  rigidity.  First  we  provide  a  lemma  which  guarantees  that 
there  are  always  appropriate  edge  choices  for  4-5  X-replacement 
operation. 

Lemma 12. If $G = (V, E)$ is a strongly minimal 3-vertex rigid
graph, there are always two vertices, a, c say, with degree of 4 which
are not adjacent, yet are connected to two different vertices, b and d
say, with degree of 5.

Proof: Since the vertices of degree 4 which are connected to the
core (vertices of degree 5) can only be adjacent to at most 3 other
vertices of the same degree, if the number of vertices of degree 4
is more than 4, there are at least two of them which are not adjacent.
Therefore, the lemma is proved for $|V| > 8$. Since $|V| \ge 6$ holds
in general, we need to prove the lemma for $|V| = 6$, 7 and 8. From
Figure 3 it is evident that in the only strongly minimal 3-vertex rigid
graph with 6 vertices, the only two 4-degree vertices are not adjacent.
Finally, it is trivial to see that (up to isomorphism) the graphs in
Figure 5 are the only possible strongly minimal 3-vertex rigid graphs
of size 7 and 8 respectively. ■


Figure 5. Strongly minimal 3-vertex rigid graphs of size 7 (a) and 8 (b).
Vertices depicted by unfilled circles are 4-degree but not adjacent.

Theorem 13. Suppose $G = (V, E)$ is a strongly minimal 3-vertex
rigid graph. After applying the 4-5 X-Replacement operation on G,
the resulting graph $G'$ is strongly minimal 3-vertex rigid on $|V| + 1$
vertices.

Proof: In this proof we show that by removing any pair of
vertices from the graph $G'$ after 4-5 X-replacement, the resulting
graph is still rigid. Suppose (a, b) and (c, d) are two candidate edges
to be removed from G in 4-5 X-replacement, with $\delta_a = \delta_c = 4$ and
$\delta_b = \delta_d = 5$. The new vertex to be connected to all of a, b, c, d
is e (Figure 6).


Figure 6. 4-5 X-replacement operation. b, d have degree 5 while a, c have
degree 4.

There are 6 distinct cases based on different choices for the vertex
pair to be removed from $G'$. For further reference we call these
vertices x and y.

a. $x, y \in V \setminus \{a, b, c, d\}$: We know that $G - x - y$ is rigid. In
this case, none of the edges connecting e to one of {a, b, c, d} is
removed. Therefore, one can interpret $G' - x - y$ as an extension of
$G - x - y$ by an ordinary X-replacement operation. This operation has
been proved to preserve rigidity [15]. Hence, in this case
the result of the operation is still a rigid graph.

b. $x \in \{a, b\}$ (or $x \in \{c, d\}$, which is the same by symmetry)
and $y \in V \setminus \{a, b, c, d\}$: Without loss of generality suppose
that for this case $x = a$ (Figure 7). Therefore, $G' - x - y$ can be
obtained from $G - x - y$ by a standard edge splitting operation which
preserves rigidity [16]. Since $G - x - y$ is rigid, $G' - x - y$ is also
rigid.


Figure 7. Removing $x = a$ and $y \in V \setminus \{a, b, c, d\}$ from G and $G'$.
The operation reduces to the edge splitting operation on the (c, d) edge.

c. $x \in \{a, b\}$ and $y \in \{c, d\}$ (one vertex from each edge): Again
suppose, without loss of generality, that $x = a$ and $y = c$ (Figure
8). One can simply observe that $G' - x - y$ can be produced by
applying a vertex addition to $G - x - y$, which preserves rigidity
when applied to a rigid graph [16]. Therefore, $G' - x - y$ is rigid.


Figure 8. Removing $x = a$ and $y = c$. The operation reduces to a vertex
addition operation in which e is connected to b, d by 2 edges.

d. $x = a$ and $y = b$ (both are the extremes of one edge): Without
loss of generality suppose that $x = a$ and $y = b$ (Figure 9). After
removing vertex b, the resulting graph $G'' = (V'', E'') = G - b$ is still
2-vertex rigid and has $|E''| = 2|V| + 2 - 5 = 2|V''| - 1$ edges. According
to Theorems 4 and 5, it is strongly minimal 2-vertex rigid with exactly
two vertices of degree 3 (which had been 4-degree vertices connected
to b before it was removed). Since vertex a is one of the 3-degree
vertices in this graph, according to Theorem 5, removing it and all
of its incident edges ensures that the resulting graph is redundantly
rigid. Therefore, $G - x - y - (c, d)$ is still rigid. One can easily
conclude that adding e with two edges to this graph, which forms
$G' - x - y$, is a case of vertex addition extension of a rigid graph
and therefore $G' - x - y$ is rigid.


Figure 9. Removing $x = a$ and $y = b$.

e. $x = e$: In this case if $y \in \{a, b, c, d\}$, we can observe that the
difference between $G - y$ and $G' - x - y$ is that $G - y$ has an extra
edge (c, d). Since $G - y$ is 2-vertex rigid, we can conclude that it is
2-edge rigid [18]. Therefore, by removing one edge like (c, d) from
it, the result ($G' - x - y$) is still rigid (Figure 10).

If $y \notin \{a, b, c, d\}$, the difference between $G' - x - y$ and $G - y$ is
in two edges (c, d) and (a, b) (Figure 11). From Theorem 5 we know
that $G - y - (c, d)$ is minimally redundantly rigid (since $E_i = (c, d)$
in one possible partition of $G - y$). Therefore, $G - y - (c, d) - (a, b)$
is rigid. By looking at Figure 11 one can easily realize that
$G - y - (c, d) - (a, b)$ is $G' - y - x$. Hence, $G' - y - x$ is rigid.


Figure 10. Removing $x = e$ and $y = a$.


Figure 11. Removing $x = e$ and $y \in V \setminus \{a, b, c, d\}$.

This completes the proof of Theorem 13. ■

IV.  Conclusions 

In  this  paper  we  studied  the  robustness  of  information  architectures 
to  control  the  formation  of  autonomous  agents.  As  these  agents 
sometimes  operate  in  hazardous  environments  like  battle-fields,  they 
are  prone  to  multiple  agent  and/or  link  loss.  Due  to  the  higher  severity 
of  agent  loss  than  link  loss,  the  main  contribution  of  this  paper  is  to 
propose  information  architectures  for  multi-agent  formations,  which 
are  robust  against  the  loss  of  multiple  agents,  in  the  sense  of  retaining 
rigidity  of  the  formation  when  such  loss  occurs.  By  adopting  the 
notion of k-vertex rigid graphs (corresponding to formations which
are tolerant to the loss of up to $k-1$ agents), we derived some
distinct necessary and sufficient conditions. We also established the
size independence property for $k \le 3$, i.e. the cost, in terms of extra
edges required to secure the k-vertex rigidity property as opposed to
mere  rigidity,  is  very  small  and  independent  of  the  number  of  agents 
in  the  formation.  As  explained  before,  this  low  cost  property  is  lost 
when  robustness  against  the  loss  of  more  than  two  agents  is  sought. 
This  observation  motivated  the  detailed  study  of  3-vertex  rigid  graphs 
with  the  minimum  edge  count.  A  partial  characterization  of  these 
graphs  was  derived,  followed  by  a  constructive  approach  through 
which  one  can  obtain  a  3-vertex  rigid  graph  on  arbitrary  number  of 
vertices.  This  characterization  enables  the  formation  to  be  designed 
with  the  robustness  against  the  loss  of  up  to  two  agents  prior  to  any 
actual  mission,  a  proactive  approach. 

The main assumption in this paper was that the control operation is
cooperative in the sense that both agents at the ends of a control link
cooperate to maintain the desired distance. This enabled us to model
the formation with an undirected graph. However, there are other
schemes in which only one of the agents is responsible for maintaining
a distance. These problems and the robustness of such formations
can be studied via the definition of persistent formations [2] as an
extension of this work.

Finally, although we provided some necessary and sufficient conditions
on k-vertex rigid formations, it is still an open problem to
fully characterize such architectures. This is one possibility for future
work.

Abstract: In studies on the localization of wireless sensor networks (WSNs), it has been
shown that a network is in principle uniquely localizable if its underlying graph is globally rigid
and there are at least $d + 1$ non-collinear anchors (in d-space). The high likelihood of losing
nodes or links in a typical WSN, especially in mobile WSNs where the localization often needs
to be repeated, makes it necessary to have not only localizable network structures but also
structures which remain localizable after the loss of multiple nodes/links. The problem of
characterizing robustness against the loss of multiple nodes, which is more challenging than the
problem of multiple link loss, is studied here for the first time, though there have been some
results on single node loss. We provide some sufficient properties for a network to be robustly
localizable. This enables us to answer the question of how to make a given network robustly
localizable. We also derive a lower bound on the number of links such a network should have.
Specializing to the case of robustness against the loss of up to 2 nodes, we propose the optimal
network structure, in terms of the required number of distance measurements.

Keywords:  Robust  Localizability,  Localization,  Wireless  Sensor  Networks,  Global  Rigidity. 


1.  INTRODUCTION 

Knowing the location of the nodes in a wireless sensor
network is critical, as in many applications, the interpretation
of the data and decision making is impossible without
knowing the position of the detected event. The mobility or
unplanned deployment of the nodes in a WSN necessitates
a localization technique which can be frequently executed.
A variety of techniques are proposed in the literature.
Among them, there are schemes in which only a small
number of special nodes (called anchors) have their positions
known a priori (Aspnes et al., 2006), sometimes
because they are GPS-equipped. Then by obtaining a set of
distance measurements between enough pairs of ordinary
nodes, an algorithm determines the node positions, using
the distances and anchor location data; the process is
called network localization.

There is a fundamental question in network localization
that needs to be answered prior to the localization process:
what structure should a network have, in order to
be localizable? It is important to see beforehand if the
network is localizable, as it is a waste of effort to seek
to localize a network which is actually not localizable.
This question is answered in (Aspnes et al., 2006) with
the help of recent results from graph theory including
the concept of Global Rigidity. It is proved there that a
network is uniquely localizable if and only if its underlying
graph is globally rigid and there are at least $d + 1$
non-collinear anchors in d-space ($d \in \{2, 3\}$). The underlying
graph G(V, E) of a network is the one in which there is a
vertex corresponding to each network node in the vertex
set V and two vertices are connected via an edge in the
edge set E if the distance between the corresponding nodes
is known (note that in modeling a network, the graph
itself does not contain the length data). A realization
of a graph with associated length data is an assignment
of the vertices of the graph to points in $\mathbb{R}^d$ such that the
distance between points corresponding to adjacent vertices
in the graph equals the distance associated with the
corresponding edge of the graph. A graph is called globally rigid
if all of its realizations in d-space are congruent, i.e.
each can be obtained from another realization of the graph only
by a combination of reflections, rotations and translations
of the whole graph. It is a nontrivial result of the theory
that global rigidity is a property determinable from the
graph alone (i.e. without the distance set) provided that
the distances correspond to generic node positions (e.g.
collinearities are likely to be excluded).

One of the most challenging issues in sensor networks is
the high possibility of failures either in communication
links or sensor nodes themselves, due to different causes,
e.g. signal jamming, obstacles, power depletion, mechanical
failure, etc. This may result in changes to the structure
of the network (and also in the underlying graph). In
other words, such failures may cause a previously uniquely
localizable network to become non-localizable; in a mobile
network where re-localization must occur due to the motion,
this is especially serious. It is obvious that coping
with a single node loss is more demanding than a single
link loss, as removal of any node also results in the removal
of all of its incident links. The solution to the problem of
securing tolerance against node loss is to introduce some
sort of redundancy in the distance measurements (links),
i.e. having network structures which are robust against the
loss of some nodes and/or communication links. Despite
the importance of this robustness property, only those
structures which are robust against the loss of a single
node appear to have been studied up to now (Summers
et al., 2008). In (Yu and Anderson, 2008) some general
properties of rigid graphs (rigidity being a simpler property
than global rigidity) which remain rigid after the loss of
nodes/links are studied.
This work is extended further in (Yu et al., 2010) for
sensor network localization, concerning both node loss and
link loss tolerance. The authors generalized the notion of
redundant rigidity to (p, q)-rigidity: the ability to retain
(global) rigidity given the loss of any $p - 1$ nodes (including
their incident links) and also any further $q - 1$
links. However, they only established properties associated
with redundancy (robustness) under the loss of any $q - 1$
links ((1, q)-rigidity). As is argued in (Yu et al., 2010),
characterizing (p, 1)-rigidity (redundancy under loss of any
$p - 1$ vertices) and (p, q)-rigidity are still open problems.

In this paper, the problem of robust localizability against
the loss of multiple nodes in 2D is studied for
the first time, by proposing structures for the underlying
graph which remain globally rigid after the loss of p
vertices. We also argue briefly in Section 3.1 that for
networks defined by random geometric graphs (a common
assumption for large scale sensor networks), formulae
are available indicating a minimal transmission radius
ensuring the robust localizability property. To elaborate
on how the obtained results can be used, we study the
case where p = 2 and suggest a class of graphs which are
robust under the loss of up to 2 vertices. Such graphs are,
as we show in the paper, optimal in terms of the number
of distance measurements they require. This is a starting
point for further studies of such structures for general p.

In this work, we always assume the ambient space to be
2D, unless explicitly noted otherwise. We also assume that there
are $p + 3$ anchors in the network so that after removing
any p nodes there are still 3 anchors in the network.
The structure of the paper is as follows. In Section 2,
the required background is reviewed. Section 3 contains
the main contribution of the paper. Finally, concluding
remarks are made in Section 4.


2.  BACKGROUND 

In this section we recall some definitions and properties of
(minimal) rigidity, redundant rigidity and global rigidity,
followed by the generalized notion of redundant rigidity.
For a formal definition and detailed introduction to rigidity
and global rigidity please refer to (Graver et al., 1993;
Connelly, 2005).


Fig. 1. (originally from (Yu and Anderson, 2009)) Realization
of a graph in 2D that is (a) non-rigid, (b) minimally
rigid and (c) redundantly rigid (also globally
rigid).

2.1  Rigidity,  Minimal  Rigidity  and  Redundant  Rigidity 

Assume that in a realization of an underlying graph in an
ambient space (2D or 3D), each point is a revolute joint
and each edge is a solid bar with a specified length. This
framework (a term commonly used in rigidity theory), and
therefore the underlying graph, is called rigid if under any
motion of the framework in the space, the distance between
each pair of points remains constant over time, no matter
whether there is or is not an explicit edge connecting them
(see (Graver et al., 1993) for a more precise definition).
The extension of the term rigid to refer to the graph is
valid since it can be shown that if a realization of a graph
is rigid for one set of lengths, a realization for almost any
length set will also be rigid. A graph is minimally rigid if
it is rigid and removing any one of the edges results in a
nonrigid graph (Figure 1b). In (Laman, 1970), it is proved
that every rigid graph contains a minimally rigid subgraph
with the same vertex set. A graph is termed redundantly
rigid if it is rigid and after removal of any edge, it still
remains rigid (Figure 1c).
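
The definitions of rigidity and redundant rigidity above can be tested
numerically on small graphs. The following Python sketch is an illustration
and not a method taken from this paper: it assumes the standard
rigidity-matrix criterion, under which a framework on n >= 2 points in 2D is
(infinitesimally) rigid when its rigidity matrix has rank 2n - 3, and it
computes that rank exactly with rational arithmetic at given or randomly
drawn integer positions. Random positions are generic only with high
probability, and all function names are hypothetical.

```python
from fractions import Fraction
import random


def rigidity_rank(n, edges, pos=None, seed=0):
    """Rank of the 2D rigidity matrix for vertices 0..n-1 placed at pos."""
    if pos is None:
        # Assumption: random integer points stand in for generic positions.
        rng = random.Random(seed)
        pos = [(rng.randrange(10**6), rng.randrange(10**6)) for _ in range(n)]
    rows = []
    for u, v in edges:
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        row = [Fraction(0)] * (2 * n)
        row[2 * u], row[2 * u + 1] = Fraction(dx), Fraction(dy)
        row[2 * v], row[2 * v + 1] = Fraction(-dx), Fraction(-dy)
        rows.append(row)
    rank = 0
    for col in range(2 * n):  # exact Gaussian elimination, column by column
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_rigid(n, edges, pos=None):
    """True if the rigidity matrix reaches the maximum rank 2n - 3 (n >= 2)."""
    return n <= 1 or rigidity_rank(n, edges, pos) == 2 * n - 3


def is_redundantly_rigid(n, edges, pos=None):
    """True if the graph is rigid and stays rigid after deleting any one edge."""
    edges = list(edges)
    return is_rigid(n, edges, pos) and all(
        is_rigid(n, edges[:i] + edges[i + 1:], pos) for i in range(len(edges))
    )


# Illustrative positions with no three points collinear.
pos = [(0, 0), (4, 0), (1, 3), (5, 2)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-cycle: non-rigid
braced = square + [(0, 2)]                  # one diagonal: minimally rigid
k4 = braced + [(1, 3)]                      # both diagonals: redundantly rigid
print(is_rigid(4, square, pos), is_rigid(4, braced, pos),
      is_redundantly_rigid(4, braced, pos), is_redundantly_rigid(4, k4, pos))
```

The three small graphs play the roles of the non-rigid, minimally rigid and
redundantly rigid cases described above; exact fractions are used so that the
rank is not affected by floating-point rounding.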


2.2  Global  Rigidity 

A graph is termed globally rigid if any two of its realizations
in the space are congruent, i.e. each realization
can be obtained from any other realization only by a
combination of reflections, translations and/or rotations
of the whole graph (Figure 1c). The following theorem
from (Jackson and Jordan, 2005) gives a characterization
of globally rigid graphs in 2D.

Theorem 1. The graph G = (V, E) with $|V| \ge 4$ is globally
rigid if and only if it is 3-connected and redundantly rigid.

A graph is said to be k-connected if it is connected and
after removal of any set of up to $k - 1$ vertices, it still
remains connected, see (Nagamochi and Ibaraki, 2008).
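
The definition of k-connectivity can be checked directly on small graphs. The
sketch below is a brute-force illustration in Python, not one of the efficient
algorithms of (Nagamochi and Ibaraki, 2008): it removes every vertex set of
size up to k - 1 and tests connectivity by graph search, so its cost grows
exponentially with k. The function names are hypothetical, and the requirement
that the graph has more than k vertices is the usual convention, assumed here
because the definition above does not state it.

```python
from itertools import combinations


def adjacency(n, edges):
    """Adjacency sets for an undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_connected(adj, removed=frozenset()):
    """True if the graph minus `removed` is non-empty and connected."""
    alive = [v for v in adj if v not in removed]
    if not alive:
        return False
    seen, stack = {alive[0]}, [alive[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(alive)


def is_k_connected(adj, k):
    """Brute force: no set of fewer than k vertices disconnects the graph."""
    if len(adj) <= k:  # assumed convention: more than k vertices are needed
        return False
    return all(
        is_connected(adj, set(cut))
        for size in range(k)
        for cut in combinations(adj, size)
    )


c5 = adjacency(5, [(i, (i + 1) % 5) for i in range(5)])
k5 = adjacency(5, [(i, j) for i in range(5) for j in range(i + 1, 5)])
print(is_k_connected(c5, 2), is_k_connected(c5, 3), is_k_connected(k5, 4))
```

The 5-cycle survives the loss of any single vertex but not of two
non-adjacent ones, while the complete graph on 5 vertices is 4-connected.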

2.3  Generalizing  Redundant  (Global)  Rigidity 

The notion of redundant rigidity can be generalized to the
loss of k edges and/or k vertices (Yu and Anderson, 2009).
A graph is termed k-edge rigid if after deletion of any set
of up to $k - 1$ edges, a rigid graph always results. In
the same way, a graph is k-vertex rigid if after deletion
of any set of up to $k - 1$ vertices, the resulting graph is
still rigid. Similarly, a graph is called k-edge (k-vertex)
globally rigid if after deletion of any set of up to $k - 1$
edges (vertices), a globally rigid graph always results (Yu
et al., 2010). The focus of this work is on k-vertex global
rigidity. In this paper, to avoid near trivialities and because
the case is actually special in terms of the probable results,
we assume that G = (V, E) is not a complete graph (one in
which there is an edge between every vertex pair), as it is
obvious that the complete graph $K_l$ is $(l - 2)$-vertex globally
rigid and measuring the distances between all pairs of the
vertices in the graph (which results in a complete underlying
graph) is inefficient in practice.

Let us record a few properties of k-vertex globally rigid
graphs. Lemma 2 is similar to Corollary 2 in (Yu and
Anderson, 2009), but considers k-vertex global rigidity
instead of k-vertex rigidity.

Lemma 2. If G = (V, E) is k-vertex globally rigid, then it
is $(k + 2)$-connected. Also, each vertex $v \in V$ has degree
at least $k + 2$ ($\delta_G(v) \ge k + 2$, where $\delta_G(v)$ is the degree of
vertex v) and the number of vertices satisfies the inequality
$|V| \ge k + 3$.

Proof. To prove $(k + 2)$-connectivity by obtaining a contradiction,
suppose G is not $(k + 2)$-connected. Then there
exists a cut set, say S, where $|S| \le k + 1$. It is obvious
(from the definition) that if $|V| \le k - 1$ then G cannot be
k-vertex globally rigid. Now let $U \subseteq S$ be any arbitrary set
of vertices with the condition that $|U| = k - 1$. According
to the definition of k-vertex global rigidity, $G \setminus U$ is still
globally rigid. However, $T = S \setminus U$ is a cut-set in $G \setminus U$ and
has $|T| = |S \setminus U| \le 2$ vertices. This means that $G \setminus U$ is not
3-connected and therefore $G \setminus U$ is not globally rigid. This
contradiction implies that G is $(k + 2)$-connected.

The $(k + 2)$-connectivity implies $\delta_G(v) \ge k + 2$ for any
$v \in V$. Otherwise, there is a set U of up to $k + 1$
vertices (the neighbors of a vertex v with $\delta_G(v) \le k + 1$)
whose removal makes the graph disconnected. It follows
that $|V| \ge k + 3$ holds, as there should be at least $k + 3$
vertices in V so that every vertex has a degree of at least
$k + 2$. ■
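
The degree and vertex-count conditions of Lemma 2 are cheap to evaluate and
can serve as a first filter before any more expensive rigidity test. The
Python sketch below is only an illustration under that reading: the function
name and the adjacency-dictionary input format are assumptions made here, the
(k + 2)-connectivity condition is deliberately not checked, and passing the
filter does not imply k-vertex global rigidity, since the conditions are
necessary but not sufficient.

```python
def lemma2_violations(adj, k):
    """List the degree/size conditions of Lemma 2 that the graph fails.

    adj maps each vertex to the set of its neighbours (assumed input format).
    An empty list means the graph passes these two necessary conditions only.
    """
    problems = []
    if len(adj) < k + 3:
        problems.append("fewer than k + 3 vertices")
    low = sorted(v for v, nbrs in adj.items() if len(nbrs) < k + 2)
    if low:
        problems.append(f"vertices with degree below k + 2: {low}")
    return problems


# A 5-cycle and the complete graph on 6 vertices as small test inputs.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
k6 = {i: set(range(6)) - {i} for i in range(6)}
print(lemma2_violations(c5, 1))
print(lemma2_violations(k6, 3))
```

The 5-cycle is rejected already for k = 1 because every vertex has degree 2,
whereas the complete graph on 6 vertices passes both checks for k = 3.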

A graph is called minimally k-vertex globally rigid if it
is k-vertex globally rigid but after removing any one of
the edges the resulting graph is no longer k-vertex globally
rigid. It is not hard to argue that (a) k-vertex globally rigid
graphs exist for any positive k (we present a particular
construction in Section 3.2 below) and (b) given any
k-vertex globally rigid graph, a subgraph can be obtained
by edge removal which is minimally k-vertex globally rigid.
Thus the concept is not an empty one.

In (Jordan and Szabadka, 2009) it is shown that there is
an operation, called 1-extension (or edge-splitting), which
can grow any globally rigid graph by 1 vertex in 2D. In this
operation, an edge $(u, v) \in E$ is removed from E and a new
vertex z is connected to both u and v and an arbitrary
third vertex $w \in V$, $w \notin \{u, v\}$.
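
The 1-extension step is a purely combinatorial edit of the edge set, which the
following Python sketch makes concrete. It is a minimal illustration under the
description above; the function name and the edge representation (a set of
two-element frozensets) are choices made here, and the code does not itself
verify global rigidity of the input or the output.

```python
def one_extension(edges, u, v, w, z):
    """Remove edge (u, v) and join a new vertex z to u, v and a third vertex w."""
    edge_set = {frozenset(e) for e in edges}
    vertices = set().union(*edge_set)
    if frozenset((u, v)) not in edge_set:
        raise ValueError("(u, v) must be an existing edge")
    if w not in vertices or w in (u, v):
        raise ValueError("w must be an existing vertex other than u and v")
    if z in vertices:
        raise ValueError("z must be a new vertex")
    edge_set.remove(frozenset((u, v)))
    edge_set.update(frozenset((z, x)) for x in (u, v, w))
    return edge_set


# Start from the complete graph on 4 vertices and split edge (0, 1).
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
g5 = one_extension(k4, 0, 1, 2, 4)
print(len(k4), len(g5))
```

Each application removes one edge and adds three, so the edge count grows by 2
per added vertex; starting from the 6 edges of the complete graph on 4
vertices this gives 8 edges on 5 vertices, in line with the count 2|V| - 2
quoted for minimally globally rigid graphs.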

Based on Theorem 1.2 in (Jordan and Szabadka, 2009),
the number of edges in a minimally globally rigid graph
(a graph G = (V, E) which is globally rigid and for which,
$\forall e \in E$, $G' = (V, E - e)$ is not globally rigid) is:

Lemma 3. If G = (V, E) is a minimally globally rigid
graph on $|V| \ge 4$ vertices, then $|E| = 2|V| - 2$ always
holds.

In the case of standard global rigidity, the definition of
minimal global rigidity is proved (Theorem 1.2 of (Jordan
and Szabadka, 2009)) to be equivalent to an alternative
statement: a rigid graph is called minimally globally rigid if
it has the minimum possible number of edges ($|E| = 2|V| - 2$)
among all globally rigid graphs with the same number
of vertices. For k-vertex global rigidity, these two notions
are no longer equivalent; there are some graphs which are
minimally 2-vertex globally rigid but whose number of edges
is not the minimum possible among such graphs with the
same vertex count. This property leads us to two different
notions: strongly minimal and weakly minimal k-vertex
global rigidity (these notions are adapted from the notion
of strongly/weakly minimal 2-vertex rigidity in (Summers
et al., 2008)).

Fig. 2. (originally from (Servatius, 1989)) Examples of the
2 possible partitions of the edge set for strongly minimal
2-vertex rigid graphs: (a) the degree 3 vertices are
adjacent, (b) the degree 3 vertices are non-adjacent.

•  A k-vertex globally rigid graph is said to be strongly
minimal if it has the minimum possible number
of edges (over all minimally k-vertex globally rigid
graphs) on a given number of vertices.

•  A k-vertex globally rigid graph is said to be weakly
minimal if it has more than the minimum possible
number of edges on a given number of vertices, but
has the property that removing any edge destroys
k-vertex global rigidity.

The following results of (Servatius, 1989) characterized
the structure of strongly minimal 2-vertex rigid graphs
for the first time. Due to their relevance and later use,
we restate them here. Figure 2 illustrates examples of
strongly minimal 2-vertex rigid graphs of each of the
two possible types.

Lemma 4. (Lemma 1 of (Summers et al., 2008)) If G =
(V, E) is a 2-vertex rigid graph on 5 or more vertices, then
$|E| \ge 2|V| - 1$.

Theorem 5. (Proposition 1 of (Servatius, 1989)) Let G =
(V, E) be a strongly minimal 2-vertex rigid graph on 5 or
more vertices. Then G has exactly 2 vertices with degree
3 and the remaining vertices have degree 4, which implies
$|E| = 2|V| - 1$.
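
A handshake count gives a quick consistency check on the degree pattern of
strongly minimal 2-vertex rigid graphs. Assuming, as in Theorem 5, that
$|E| = 2|V| - 1$ and that every vertex has degree 3 or 4, and writing m (a
symbol introduced here only for illustration) for the number of degree-3
vertices, the sum of the degrees can be evaluated in two ways:

```latex
\[
\sum_{v \in V} \deg(v) = 2|E| = 2\,(2|V| - 1) = 4|V| - 2 ,
\qquad
\sum_{v \in V} \deg(v) = 3m + 4\,(|V| - m) = 4|V| - m
\;\Longrightarrow\; m = 2 .
\]
```

So exactly two vertices have degree 3, which is the count that Theorem 6 and
Fig. 2 rely on.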

Theorem 6. (Theorem 3.1 of (Servatius, 1989)) A graph
G = (V, E) with $|V| \ge 5$ is strongly minimal 2-vertex rigid
if and only if G has exactly two vertices of degree 3 and
there is a partition of the edge set E

$E = E_1 \cup E_2 \cup \dots \cup E_k$

such that the graph induced by $E \setminus E_i$ is minimally
redundantly rigid (i.e. the removal of any edge destroys
redundant rigidity) for all i, and either

•  $E_1$ and $E_2$ are the edges incident to the two
non-adjacent vertices of degree 3, respectively, and $E_i$ is
a single edge for $3 \le i \le k$, or

•  $E_1$ is the union of the edges incident to the two
adjacent vertices of degree 3, and $E_i$ is a single edge
for $2 \le i \le k$.

Fig. 3. A strongly minimal 2-vertex globally rigid graph
of size 10 ($C_{10}^{2}$).

The following theorem from (Summers et al., 2008)
characterizes strongly minimal 2-vertex globally rigid graphs
(Figure 3).

Theorem 7. (Theorem 10 in (Summers et al., 2008)) The
graph G = (V, E) on 5 or more vertices is strongly
minimal 2-vertex globally rigid if and only if the following
conditions hold:

•  $|E| = 2|V|$

•  G is 4-connected

•  G is redundantly strongly minimal 2-vertex rigid

A redundantly strongly minimal 2-vertex rigid graph is one
in which after removal of any edge, the result is strongly
minimal 2-vertex rigid. Such a graph can be obtained, and can
only be obtained, by joining the vertices of degree 3 in a
strongly minimal 2-vertex rigid graph whose vertices of
degree 3 are not adjacent. Figure 2 depicts examples of the 2
possible configurations for strongly minimal 2-vertex rigid
graphs. If in Figure 2(b) we connect the vertices of degree
3 (which are not adjacent), the result is a redundantly
strongly minimal 2-vertex rigid graph.

3.  RESULTS 

In this section we start by studying some sufficient conditions
for k-vertex global rigidity (Section 3.1). This will
enable us to propose localizable structures which are robust
against the loss of up to $k - 1$ vertices. Then, from
a different perspective, the structure of k-vertex globally
rigid graphs is studied and some necessary conditions are
obtained (Section 3.2). To elaborate on this result, the case
of 3-vertex global rigidity is considered in more detail and
a general class of structures which are strongly minimal
3-vertex globally rigid is introduced (Section 3.3). We conclude
this section by comparing these results on strongly
minimal 3-vertex global rigidity with the result that can be
obtained from the sufficient condition in Section 3.1.

3.1  Sufficient  Condition  for  k-Vertex  Global  Rigidity 

The notion of k-connectivity has been well studied in the
literature and there are efficient algorithms to check this
property on a given graph (Nagamochi and Ibaraki, 2008).
The idea here is to identify a $(k + j)$-connectivity condition,
for a suitable constant j, which is sufficient to ensure k-vertex
global rigidity. It is proved in (Lovasz and Yemini, 1982) that
in 2D, 6-connectivity implies rigidity and 6 is the least possible
number for this condition (k-connectivity with $k < 6$ is
not sufficient for rigidity). A recent extension of this work
(Jackson and Jordan, 2009) showed the sufficiency of this
condition for global rigidity as well.

Theorem 8. (restatement of Theorem 1.2 in (Jackson and
Jordan, 2009)) Suppose G = (V, E) is a 6-connected graph
in 2D. Then it is 2-edge globally rigid (removing any edge
results in a globally rigid graph).

We can extend this result to the case of k-vertex global
rigidity, as the following theorem shows:

Theorem 9. Assume that G = (V, E) is a $(k + 5)$-connected
graph. Then G is k-vertex globally rigid.

Proof. Let $U \subset V$ be any set of vertices with $|U| = k - 1$.
Since G is $(k + 5)$-connected, it is obvious that $G' = G \setminus U$ is
6-connected ($k + 5 - (k - 1) = 6$). Therefore, by Theorem 8, $G'$ is
globally rigid. Since U was chosen arbitrarily, we conclude that G
is k-vertex globally rigid. ■

This result is very important as it reduces the more
complicated and unfamiliar property of k-vertex global
rigidity to the well-known and easier to study property of
$(k + 5)$-connectivity. Specifically for large-scale random
wireless networks (where the nodes are assumed to be
Poisson distributed with a certain density, and they have a
common transmission range), it is shown in (Wan and Yi,
2004) that asymptotically, i.e. as the node count tends to
infinity, there is a relation between the transmission range
and the k-connectivity property (for any k) of a wireless
sensor network, i.e. by increasing the transmission power of
the nodes in a random network above a specific threshold,
the network becomes k-connected (the threshold is NOT
of the same order as the diameter of the network, a
situation which trivially would ensure k-connectivity).
This idea, tied to Theorem 9, answers the question "given
a wireless sensor network, how can we make it robustly
localizable?":

Theorem 10. Assume a network of wireless sensor nodes.
As the node count tends to infinity, there exists a critical
transmission radius (NOT of the order of the network
diameter), say r, such that by increasing the transmission
range of every node above it, the network becomes k-vertex
globally rigid (robustly localizable against the loss of up
to $k - 1$ vertices).

Finding the value of r for various k is beyond the scope of
this paper.

3.2  Necessary  Condition  for  k-Vertex  Global  Rigidity 

In this subsection a necessary condition on k-vertex globally
rigid graphs is provided, which is a lower bound on
the number of edges these graphs have. A prerequisite of
the main theorem is to show that for fixed k, there always
exists a k-vertex globally rigid graph of arbitrary size in
which the number of edges depends linearly on the number
of vertices.

Theorem 11. For ambient space dimension d = 2 and 3,
there exists a k-vertex globally rigid graph G = (V, E)
with $|V| \ge k + 3$ and otherwise arbitrary, for which
$|E| = a|V| + b$ holds, for some a and b dependent on k
and d but independent of $|V|$.

Proof. The proof is by constructing a k-vertex globally
rigid graph which satisfies the conditions of the theorem.
First, observe that a complete graph $K_{k+d+1}$ is k-vertex
globally rigid. The number of edges in $K_{k+d+1}$ is
$m = |E(K_{k+d+1})| = \frac{(k+d)(k+d+1)}{2}$. Suppose $G_1 = (V_1, E_1)$ and
$G_2 = (V_2, E_2)$ are two $K_{k+d+1}$ graphs. It is easy to show
that by adding a set L of $k + d + 1$ edges between $G_1$ and
$G_2$, such that every vertex in $G_1$ is connected to exactly 1
vertex in $G_2$ and conversely, $G = (V_1 \cup V_2, E_1 \cup E_2 \cup L)$ is
also k-vertex globally rigid.

More generally, to construct a k-vertex globally rigid graph
G = (V, E) on $n = |V|$ vertices, imagine a series of $K_{k+d+1}$
graphs $\{G_i = (V_i, E_i)\}_{i=1..(p-1)}$ in which $G_i$ is connected
to $G_{i+1}$ by $k + d + 1$ edges, say $L_i$. Here, the $p - 1$ complete
blocks account for as many vertices as possible, so
$p = \lfloor \frac{n}{k+d+1} \rfloor + 1$ and $|V(G_p)| = n - (p-1)(k+d+1) < k + d + 1$.
For $G_p$, which may have fewer than $k + d + 1$ vertices, connect
every vertex to all vertices of $G_{p-1}$ (by $k + d + 1$ edges), which
constitute the set $L_{p-1}$ of edges between $G_{p-1}$ and $G_p$. It is
easy to show that
$G = (V, E) = (V_1 \cup \dots \cup V_p,\ E_1 \cup \dots \cup E_p \cup L_1 \cup \dots \cup L_{p-1})$
is k-vertex globally rigid. Hence, the inequality
$|E| \le (p-1)m + (p-1)(k+d+1) + (k+d+1)m$ holds. With k fixed,
$|E| = O(pm) = O(n)$ holds, i.e. the number of edges is
linear with respect to the number of vertices. ■
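
The linear edge growth in this construction can be seen by building the chain
explicitly and counting edges. The Python sketch below follows the
description in the proof under several assumptions that are made here for
illustration only: complete blocks of r = k + d + 1 vertices are labelled
consecutively, neighbouring blocks are joined by matching equally placed
vertices, the leftover block has no internal edges, and the function names
are hypothetical. The code only counts edges; it does not verify k-vertex
global rigidity.

```python
from itertools import combinations


def chained_blocks(n, k, d=2):
    """Edge set of a chain of complete blocks K_r, r = k + d + 1, on n vertices."""
    r = k + d + 1
    if n < r:
        raise ValueError("need at least k + d + 1 vertices")
    full, _rest = divmod(n, r)
    edges = set()
    for b in range(full):
        block = range(b * r, (b + 1) * r)
        edges.update(combinations(block, 2))            # complete block
        if b + 1 < full:
            edges.update((v, v + r) for v in block)     # matching to next block
    last_full = range((full - 1) * r, full * r)
    for v in range(full * r, n):                        # leftover vertices
        edges.update((u, v) for u in last_full)         # joined to the last block
    return edges


def edge_bound(n, k, d=2):
    """Upper bound (p-1)m + (p-1)r + r*m with p = floor(n/r) + 1."""
    r = k + d + 1
    m = r * (r - 1) // 2
    p = n // r + 1
    return (p - 1) * m + (p - 1) * r + r * m


for n in (10, 12, 15):
    print(n, len(chained_blocks(n, 2)), edge_bound(n, 2))
```

For k = 2 and d = 2 each additional complete block of 5 vertices contributes
10 internal edges plus 5 linking edges, so the edge count increases by a
fixed amount per block, which is the linear dependence used in the proof.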

Theorem 12. In a strongly minimal k-vertex globally rigid
graph the edge count is bounded below by the formula
$|E| \ge \lceil \frac{k+2}{2} |V| \rceil + c(k)$, where $c(k)$ is an integer (c is
independent of $|V|$ but depends on k), and if the equality
holds (i.e. $|E| = \lceil \frac{k+2}{2} |V| \rceil + c(k)$), then $c(k) \ge 0$.

Proof. Assume that G = (V, E) is a strongly minimal
k-vertex globally rigid graph on $|V| \ge k + 3$ vertices, whose
number of edges is $|E| = a|V| + c(k)$ (according to Theorem 11
such a graph exists), where $c(k)$ is independent of
$|V|$. According to Lemma 2, $\delta_G(v) \ge k + 2$ holds for every
vertex v. Therefore, the average degree in G is $\delta_{avg} \ge k + 2$.
On the other hand, $\delta_{avg} = \frac{2|E|}{|V|} = 2a + \frac{2c(k)}{|V|}$. Hence,
$2a + \frac{2c(k)}{|V|} \ge k + 2$, i.e. $k \le 2(a - 1) + \frac{2c(k)}{|V|}$.

Since the property must hold for graphs of arbitrary size
and in particular arbitrarily large $|V|$, assuming $|V| > 2c(k)$,
we will have $k \le 2(a - 1)$ or $a \ge \frac{k}{2} + 1$. Therefore,
$|E| \ge \lceil (\frac{k}{2} + 1) |V| \rceil + c(k)$ holds for $|V| > 2c(k)$.

Now suppose that for some $|V|$ the equality holds (i.e.
$|E| = \lceil \frac{k+2}{2} |V| \rceil + c(k)$). We prove that for such a strongly
minimal k-vertex globally rigid graph $c(k) \ge 0$ always
holds.

First suppose k is even. Then, we will have
$\delta_{avg} = k + 2 + \frac{2c(k)}{|V|} \ge k + 2$, which implies $c(k) \ge 0$.

If k is odd, then $|E| = \lceil \frac{k+2}{2} |V| \rceil + c(k)$, which gives
$|E| < \frac{k+2}{2} |V| + c(k) + 1$. Therefore,
$\delta_{avg} < k + 2 + \frac{2c(k)}{|V|} + \frac{2}{|V|}$ holds. This implies
$k + 2 < k + 2 + \frac{2c(k) + 2}{|V|}$, which gives $-1 < c(k)$, and so,
$c(k)$ being an integer, $c(k) \ge 0$ holds.
Note that the lower bound of Theorem 12 is achieved for
k = 2, see Theorem 7. Below, we show it is also achievable
for k = 3.
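
To make the bound of Theorem 12 concrete, the following worked instances
substitute k = 2 and k = 3 into $|E| \ge \lceil \frac{k+2}{2} |V| \rceil + c(k)$. The sample
sizes $|V| = 10$ and $|V| = 11$ are chosen here only for illustration, and taking
$c(3) = 0$ is an assumption, namely the smallest value that the theorem allows
when equality holds:

```latex
\[
k = 2:\quad |E| \ge \left\lceil \tfrac{4}{2}\,|V| \right\rceil + c(2) = 2|V| + c(2),
\qquad
k = 3:\quad |E| \ge \left\lceil \tfrac{5}{2}\,|V| \right\rceil + c(3).
\]
\[
c(3) = 0:\quad |V| = 10 \;\Rightarrow\; |E| \ge 25,
\qquad |V| = 11 \;\Rightarrow\; |E| \ge \lceil 27.5 \rceil = 28 .
\]
```

With $c(2) = 0$ the first expression reduces to $|E| = 2|V|$, the edge count
required in Theorem 7.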


3.3  Strongly  Minimal  3-vertex  Global  Rigidity  in  2D 

In this section we provide a class of graphs that are
strongly minimal 3-vertex globally rigid. According to
the important result of Theorem 12, for the case of
3-vertex global rigidity, the number of edges must satisfy
the inequality $|E| \ge \lceil \frac{5}{2} |V| \rceil + c$. In order to characterize
strongly minimal 3-vertex globally rigid graphs, we
conjecture that the number of edges has the form
$|E| = \lceil \frac{5}{2} |V| \rceil + c$ and then try to find a suitable value for c. We
also assume that the number of vertices is at least 6. Since
we are seeking graphs with a minimum number of edges,
it is reasonable to keep c at its minimum (0) and try to
find a graph which has the desired property.

Fig. 4. (a) A strongly minimal 3-vertex globally rigid
graph of size 10 ($R_{10}$). (b) A strongly minimal
3-vertex globally rigid graph of size 11 ($R_{11}$).


If $|V|$ is even, i.e. $|V| = 2n$, the equation implies
$|E| = \frac{5}{2} |V|$ and $\delta_{avg} = 5$. According to Lemma 2, $\delta(v) \ge 5$.
Therefore, the graph we are seeking is 5-regular, i.e. every
vertex has degree 5. Fortunately, there is a class of graphs
with these conditions that are 3-vertex globally rigid. An
example of such a graph with $|V| = 10$ is shown in Figure
4a, and the structure carries over in an obvious way to
graphs with 2n vertices for $n \ge 3$. The structure of such a
graph, call it $R_{2n}$, is as follows:

The graph is formed from a C_{2n} cycle [s_1, s_2, ..., s_{2n}],
to which are added the edges s_1s_3, s_2s_4, s_3s_5, ...,
s_{2n-1}s_1, s_{2n}s_2 (forming 2 cycles of size n, denoted by
C_n^1 and C_n^2 respectively) and the edge set D of edges
s_i s_{i+n}, i = 1..n (called diagonals). Note that the cycles
C_n^1 and C_n^2 are disjoint.
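The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_r2n` and `degree_sequence` are hypothetical, and the vertices s_1, ..., s_{2n} are relabelled 0, ..., 2n-1 for convenience. It checks the two counting facts used above (5-regularity and |E| = (5/2)|V|); it does not test global rigidity.

```python
def build_r2n(n):
    """Edge set of R_{2n} on vertices 0..2n-1 (s_1..s_2n, shifted to 0-based)."""
    if n < 3:
        raise ValueError("the construction in the text assumes n >= 3")
    m = 2 * n
    edges = set()
    for i in range(m):
        edges.add(frozenset((i, (i + 1) % m)))  # cycle edge s_i s_{i+1}
        edges.add(frozenset((i, (i + 2) % m)))  # added edge s_i s_{i+2}
    for i in range(n):
        edges.add(frozenset((i, i + n)))        # diagonal s_i s_{i+n}
    return edges


def degree_sequence(edges, num_vertices):
    """Degree of every vertex, indexed by vertex label."""
    degree = [0] * num_vertices
    for edge in edges:
        for vertex in edge:
            degree[vertex] += 1
    return degree


for n in range(3, 8):
    r2n = build_r2n(n)
    assert len(r2n) == 5 * n                          # |E| = (5/2)|V|
    assert set(degree_sequence(r2n, 2 * n)) == {5}    # 5-regular
```

Storing each edge as a `frozenset` makes the edge set insensitive to orientation, so a duplicate edge would be detected by the edge count.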

It needs to be proved that the proposed graph is 3-vertex
globally rigid. The proof has two steps: first, showing the
graph is 5-connected, and second, after removal of any 2
vertices, showing the result is redundantly rigid. Proofs
are omitted due to space limitations.

Theorem 13. The graph R_{2n} on 6 or more vertices,
depicted in Figure 4a, is 5-connected.

Theorem 14. The graph R_{2n} on 6 or more vertices is
3-vertex globally rigid.

To obtain a class of strongly minimal 3-vertex globally
rigid graphs with an odd number of vertices, i.e.
|V| = 2n + 1, we define an operation, termed 2-extension,
over R_{2n} graphs to increase |V| by 1. We show that this
operation preserves 3-vertex global rigidity. The sketch of
2-extension is as follows:

Assume vertices s_i, i = 1..4, to be 4 consecutive vertices
in the C_{2n} cycle as depicted in Figure 4b. The diagonal
neighbor of s_1 is called s_{n+1}. The operation consists of
adding a new vertex, say v, connecting it to s_i, i =
1..4, (n + 1), and removing (s_1, s_4) and (s_2, s_3).

Let R_{2n+1} be the graph obtained from R_{2n} by applying
the 2-extension operation (Figure 4b). Evidently,
|E(R_{2n+1})| = |E(R_{2n})| + 3 = (5/2)(2n) + 3 = (5/2)(2n + 1) + 1/2
holds, which is consistent with the edge count condition
(|E(R_{2n+1})| = ⌈(5/2)|V(R_{2n+1})|⌉ = (5/2)(|V(R_{2n+1})| - 1) + 3).
Hence, R_{2n+1} has the minimum number of edges, and by
proving that R_{2n+1} is 3-vertex globally rigid, it follows
that it is strongly minimal 3-vertex globally rigid.
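The edge-count identity above is easy to confirm numerically. The short Python check below is a sketch; the helper name `min_edge_count` is hypothetical and simply evaluates ⌈(5/2)|V|⌉ in integer arithmetic.

```python
def min_edge_count(num_vertices):
    """ceil((5/2) * |V|), evaluated without floating point."""
    return (5 * num_vertices + 1) // 2


for n in range(3, 50):
    edges_even = 5 * n            # |E(R_2n)| = (5/2)(2n)
    edges_odd = edges_even + 3    # 2-extension adds 5 edges and removes 2
    assert edges_even == min_edge_count(2 * n)
    assert edges_odd == min_edge_count(2 * n + 1)
    assert edges_odd == (5 * ((2 * n + 1) - 1)) // 2 + 3
```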

Theorem 15. The 2-extension operation applied to R_{2n}
preserves 3-vertex global rigidity.

Proof. The proof is similar to the proof of Theorem
14. The method is to consider all possible choices of U,
U ⊂ V(R_{2n+1}) and |U| = 2, and show that R_{2n+1} \ U
is globally rigid. We omit the proof due to space
limitations. ■

3.4 Comparison of Results

Table 1 shows the result of comparing the structures
suggested for 3-vertex globally rigid graphs in Section 3.1
(8-connected) and Section 3.2 (strongly minimal). It is
easy to see that the strongly minimal 3-vertex globally
rigid structure is at least (8 - 5)/8 = 3/8 = 37.5% more
efficient for localizing the network (in terms of the minimum
number of required distance measurements to achieve
the specified tolerance to vertex loss) than the structure
suggested based on connectivity. Of course, there will be
other redundancies in the larger network which are not
present in the strongly minimal network.

Table 1. Comparison of the minimum number of required
distance measurements for the suggested structures with
n vertices

Structure            Min # of Dist. Measurements
8-Connectivity       8n
S.M. 3-V-G Rigid     5n


4.  CONCLUSION 

In this paper we studied the structure of localizable sensor
networks which are tolerant against the loss of multiple
nodes (the network remains localizable after the loss of
multiple nodes). This is done by introducing the notion
of k-vertex globally rigid graphs, in which after removal
of any set of up to k - 1 vertices the resulting network
still remains globally rigid (localizable). For 2D networks,
we showed that a graph is k-vertex globally rigid if it is
(k + 5)-connected. For networks modeled by a random
geometric graph, this reduction may enable us to propose a
critical transmission radius r for the nodes, above which
the network becomes k-vertex globally rigid. Furthermore,
by providing a lower bound on the edge count of k-vertex
globally rigid graphs in terms of the vertex count, we also
proposed a class of 3-vertex globally rigid graphs with
minimum edge count. Comparisons showed a considerable
improvement achieved by this class over the 8-connectivity
condition, in terms of the required number of distance
measurements. This suggests that there is certainly a
benefit in studying the structure of strongly minimal
networks for a given k (and for k > 3 this remains to
be done). Indeed, the full characterization of (strongly
minimal) k-vertex globally rigid graphs, in addition to
efficient algorithms to test this property on a given graph,
are still open problems that constitute our future work.
Abstract: The ability to localize a sensor network is important
for its deployment. A theoretical result exists defining
necessary and sufficient conditions for unique network
localizability (for inter-sensor range-based localization); it has
its roots in Graph Rigidity Theory, where sensors and
links/measurements are modelled as vertices and edges of a graph,
respectively. However, critical missions do require a level of
robustness for localizability, ensuring that localizability is
retained in the event of link (edge) losses and/or sensor (vertex)
losses. This work characterizes this robustness through a novel
notion of redundant localizability, which is backed by redundant
rigidity. Analogously to two well-known types of result for
rigidity characterization, similar results are developed for edge
redundant rigidity; they are supplemented by rather fewer results
dealing with vertex redundant rigidity. These preliminary results
may shed light on further study of redundant localizability.

Index Terms: Rigidity Theory, Network Localizability, Sensor
Network Localization

I.  Introduction 

Obtaining  location  information  is  a  fundamental  task  in 
a  sensor  network,  since  otherwise  the  sensed  data  will 
become  much  less  valuable.  The  location  of  every  sensor 
node,  if  not  already  known  from  the  deployment  of  the 
network  or  directly  from  GPS,  can  only  be  determined  from 
a  process  based  on  measurements,  the  network  structure  and 
partially  known  location  information.  This  process  is  referred 
to  as  localization,  and  the  network  property  that  governs 
the  feasibility  of  localizing  the  entire  network  given  the 
measurement  is  called  localizability. 

Most localization algorithms are based on range
measurements, or measurements that can be converted to
inter-node distance measurements.

Aspnes et al. [2] formally prove that for a 2D localization
problem, a necessary and sufficient condition for localizability
given inter-node distance measurements is that the network
graph must be globally rigid (the concept is reviewed below)
and at least three noncollinear sensors must have known
location information (hence they are anchors).

Having  just  global  rigidity  may  be  unrealistic  in  practical 
scenarios,  as  not  only  can  the  localization  algorithms  demand 
very  high  computational  complexity,  but  also  because  the 
global  rigidity  property  (hence  localizability)  can  easily  be 
lost  if  some  node  or  measurement  becomes  unavailable, 
manifesting  some  type  of  node/link  failure  in  the  network. 
The first problem has attracted great interest among
researchers. Anderson et al. [1] discussed various graphical
properties of easily localizable sensor networks. A main
result is that by doubling or tripling the sensing radius,
the new network graph acquires special properties that provide
guarantees on the computational complexity, sometimes
linear in the number of sensors. In [14], limitations of
classical trilateration algorithms are analyzed, proving their
insufficiency in even recognizing the localizability of a graph.
A novel localization method generalizing trilateration is
proposed based on aggregated knowledge of the subnetwork
formed by all 1-hop neighbors of each node and the node
itself; roughly speaking, the induced subgraph of the neighbors
and the node is actually a wheel graph that is globally
rigid.

The second problem, viz. ensuring tolerance of node
or link failure, has rarely been addressed. A recent study
of redundant rigidity [13] has started to shed light on a
new direction, using a graph theoretical approach. It raises
the question, pursued in depth in this paper: what are the
conditions for preserving localizability in the event of loss
of up to p link length measurements and q nodes (together
with all their associated distance measurements)?

This work studies the level of redundancy that can be built
into localizability, such that the network is guaranteed to be
localizable when the loss of nodes and/or links is allowed,
up to a certain maximum number in each case. As the result
is more of a fundamental characterization than a practical
algorithm, the typical problems in practical localization,
such as dealing with noisy measurements, characterization of
RMS error and bias in position estimates, error propagation,
and computational complexity, are not considered and remain
for future work.

The rest of this paper is organized as follows. The
preliminaries, especially relevant results of graph rigidity
theory, are introduced in Section II. The connections between
redundant rigidity, redundant connectivity, global rigidity and
redundant localizability are established in Section III. Based
on these, Section IV discusses the relationship between vertex
and edge redundancy. Two characterizations of (1, p)-rigidity, a
concept associated with the loss of links, are then given in
Section V and Section VI, respectively. Section VII sets out
the relationship between redundant connectivity and
(1, p)-rigidity on the one hand and redundant global rigidity and
localizability on the other. Finally, the conclusions and
proposals for future work are presented. Due to space
limitations, all proofs are omitted; they can be found in an
extended version available on request from the first coauthor.
II. Preliminaries

We will model a network as an undirected graph G =
(V, E) where the nodes are elements of the vertex set V,
and an edge e_ij ∈ E if the distance between nodes i and
j is available. In the sequel, for X ⊂ V, i_G(X) will be the
number of edges in the graph induced by the vertex set X in
G. Further, with V_i ⊂ V, the edge set in the graph induced by
the vertex set V_i in G will be denoted by E_G(V_i). Similarly,
with E_i ⊂ E, the vertex set in the graph induced by the edge
set E_i in G will be denoted by V_G(E_i).

As  noted  in  the  introduction  a  result  due  to  [2]  proves  that 
a  two-dimensional  network  G  =  ( V,  E)  is  localizable  iff  it 
is  globally  rigid  and  has  at  least  three  noncollinear  anchor 
nodes.  Thus  in  section  II-A  we  provide  some  preliminary 
information  about  rigidity  and  global  rigidity. 

In subsection II-B we provide two alternative, though
equivalent, characterizations of rigidity. The first, a celebrated
result due to Laman [10], involves edge counts on subgraphs
induced by components of vertex partitions. The second,
primarily due to Lovasz and Yemeni [11], involves vertex counts
on subgraphs induced by components of edge partitions.

Subsequently, in Section III we provide pertinent redundancy
definitions and an important linkage between redundant
rigidity, redundant connectivity and global rigidity that
sets up the remainder of the paper.

A.  Rigidity  and  Global  Rigidity 

In this subsection, we recall briefly the notions and some
properties of (minimal) rigidity and global rigidity. Our
description will be largely based on graphs that model the
network under consideration. Formal definitions of graph
rigidity and global rigidity, which involve the theory of graph
representations, can be found in, e.g., [4], [5], [12].

A graph is rigid if the only edge-length preserving smooth
motions in a generic network¹ modelled by the graph are
translation and rotation, which result only in congruent
networks. A graph is minimally rigid if it is rigid and no
single edge can be removed without losing rigidity.

Close inspection of the rigidity definition reveals that it
considers smooth motions only. Under a discontinuous
transformation of a network corresponding to a rigid
graph, it is possible to obtain a non-congruent network that
preserves the edge lengths, due to flip and/or flex ambiguity.

To eliminate these ambiguities and to uniquely determine
the relative positions of all vertices given the set of distance
measurements (corresponding to the edge-length set),
a stronger definition is desirable. A graph is globally rigid
if any two realizations of a network (with prescribed edge
lengths and modelled by the graph) with the same edge-length
set are congruent, i.e., differ at most by translation, rotation or
reflection. We refer the reader to [4], [8] for source material
on global rigidity.

1  The  graph  theory  literature  generally  uses  a  different  term  than  network 
in  discussing  rigidity,  the  word  framework  being  commonly  employed. 
Because  of  this  paper’s  connection  with  sensor  networks,  we  retain  the 
word  network  below. 


B.  Rigidity  Characterization 

We provide now two different sets of necessary and
sufficient conditions for a graph to be rigid. The first, due
to Laman [10], involves edge counts on graphs induced by
vertex subsets.

Theorem 2.1: [Laman's Theorem] A graph G = (V, E)
is minimally rigid iff |E| = 2|V| - 3 and for all X ⊂ V,
i_G(X) ≤ 2|X| - 3. The graph is rigid iff there is an E' ⊂ E
such that G' = (V, E') is minimally rigid.
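The counting condition of Theorem 2.1 can be checked directly on small graphs. The following Python sketch is a brute-force check that enumerates every vertex subset, so it is exponential in |V| and meant only for illustration; the function names are hypothetical, and the restriction to subsets with at least two vertices follows the convention used in Definition 2.4 below.

```python
from itertools import combinations


def induced_edge_count(edges, subset):
    """i_G(X): number of edges with both endpoints in the subset X."""
    members = set(subset)
    return sum(1 for u, v in edges if u in members and v in members)


def is_minimally_rigid(vertices, edges):
    """Laman's counts: |E| = 2|V| - 3 and i_G(X) <= 2|X| - 3 for |X| >= 2."""
    vertices = list(vertices)
    n = len(vertices)
    if n < 2 or len(edges) != 2 * n - 3:
        return False  # graphs on fewer than two vertices are not treated here
    for size in range(2, n + 1):
        for subset in combinations(vertices, size):
            if induced_edge_count(edges, subset) > 2 * size - 3:
                return False
    return True


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = list(combinations(range(4), 2))
k33 = [(a, b) for a in (0, 1, 2) for b in (3, 4, 5)]

assert is_minimally_rigid(range(3), triangle)
assert not is_minimally_rigid(range(4), k4)       # 6 edges, one too many
assert is_minimally_rigid(range(4), k4[1:])       # K4 minus one edge
assert is_minimally_rigid(range(6), k33)          # K_{3,3} meets the counts
```

The efficient algorithms mentioned later in this section avoid the subset enumeration; this sketch trades speed for a direct reading of the theorem.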

In  the  sequel,  we  will  also  need  the  following  result  that 
presents  Henneberg  Operations  for  growing  a  rigid  graph  [7]. 

Theorem 2.2: [Vertex Addition] Consider a graph G =
(V, E), a node k ∉ V and two edges e_ik and e_jk, with
i and j elements of V. Then G is (minimally) rigid iff
(V ∪ {k}, E ∪ {e_ik, e_jk}) is (minimally) rigid.

Theorem 2.3: [Edge Splitting] Consider a graph G =
(V, E), a node t ∉ V and three edges e_ti, e_tj and e_tk,
with i, j and k elements of V and e_ij an element of E. Then
G is (minimally) rigid if (V ∪ {t}, (E \ {e_ij}) ∪ {e_ti, e_tj, e_tk}) is
(minimally) rigid.
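The two operations of Theorems 2.2 and 2.3 can be written as small graph transformations. The Python sketch below is illustrative only: the function names and the choice of representing edges as `frozenset` pairs are assumptions, and the example merely confirms that both operations preserve the edge count |E| = 2|V| - 3 when starting from a single edge.

```python
def vertex_addition(vertices, edges, new, i, j):
    """Add vertex `new` and join it to two distinct existing vertices i, j."""
    if new in vertices or i == j or not {i, j} <= vertices:
        raise ValueError("need a new vertex and two distinct existing ones")
    added = {frozenset((new, i)), frozenset((new, j))}
    return vertices | {new}, edges | added


def edge_splitting(vertices, edges, new, i, j, k):
    """Remove edge (i, j), add vertex `new` and join it to i, j and k."""
    if new in vertices or len({i, j, k}) != 3 or not {i, j, k} <= vertices:
        raise ValueError("need a new vertex and three distinct existing ones")
    old = frozenset((i, j))
    if old not in edges:
        raise ValueError("the edge to be split must exist")
    added = {frozenset((new, x)) for x in (i, j, k)}
    return vertices | {new}, (edges - {old}) | added


# Start from a single edge, which satisfies |E| = 2|V| - 3 = 1.
vertices, edges = frozenset({0, 1}), frozenset({frozenset((0, 1))})
vertices, edges = vertex_addition(vertices, edges, 2, 0, 1)    # triangle
vertices, edges = edge_splitting(vertices, edges, 3, 0, 1, 2)  # split e_01
vertices, edges = vertex_addition(vertices, edges, 4, 2, 3)

assert len(vertices) == 5
assert len(edges) == 2 * len(vertices) - 3
```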

There are efficient algorithms for checking the edge count
condition underlying Laman's theorem [9].

The second necessary and sufficient condition, due to
Yemeni and Lovasz [11], involves vertex counts on graphs
induced by edge subsets. Though less succinctly phrased,
it may be easier to check than Laman's edge count result,
and references to algorithms for checking the condition are
provided in [11]. Jackson and Jordan [8] provide an elegant
interpretation of this result using the theory of matroids.

Definition 2.1: Consider a graph G = (V, E) and E_i ⊂
E. A set of subgraphs P = {G_i = (V_G(E_i), E_i)} is termed
an m-Admissible Decomposition (AD) of G if the following
hold.

(i) |E_i| > 0.

(ii) ∪_i E_i = E.

(iii) P has at least m elements.

If m = 1, then P is simply called an AD of G.

Next we define the index of an m-AD that reflects an
associated vertex count.

Definition 2.2: Consider a graph G = (V, E) and E_i ⊂
E. We call r(P) the index of an m-AD P = {G_i =
(V_G(E_i), E_i)}_{i=1}^{k} of G, where k ≥ m, defined as

r(P) = Σ_{i=1}^{k} (2|V_G(E_i)| - 3).    (II.1)

To  state  the  result  we  need  one  more  definition. 

Definition 2.3: Consider a graph G = (V, E) and E_i ⊂
E. For a given m, an m-AD P = {G_i = (V_G(E_i), E_i)}_{i=1}^{k}
of G, k ≥ m, is termed m-minimizing if r(P) is the smallest
among the indices of all possible m-ADs of G. Such an
index will be denoted as r_m(G).

Now we provide the promised second characterization of
rigidity that follows from results in [8], [11] and [6].

Theorem 2.4: A graph G is rigid iff r_1(G) = 2|V| - 3.
Further, the edge sets E_i underlying any minimizing
admissible decomposition are disjoint.
Observe that one cannot say that r_1(G) = 2|V| - 3 implies
minimal rigidity. Indeed, consider the graph G = K4. Then
with P = {G}, r(P) = 2|V| - 3, even though G is not
minimally rigid.
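The index of Definition 2.2 and the K4 observation can be reproduced with a short brute-force computation. The Python sketch below is an assumption-laden illustration: the names `index`, `set_partitions` and `r1` are hypothetical, and `r1` searches only over partitions of E, relying on the statement in Theorem 2.4 that minimizing decompositions have disjoint edge sets. The search grows as the Bell numbers, so it is usable only for very small graphs.

```python
from itertools import combinations


def index(decomposition):
    """r(P): sum over the blocks E_i of 2|V_G(E_i)| - 3."""
    total = 0
    for block in decomposition:
        touched = {v for edge in block for v in edge}  # V_G(E_i)
        total += 2 * len(touched) - 3
    return total


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for pos in range(len(partition)):  # join an existing block
            yield partition[:pos] + [[first] + partition[pos]] + partition[pos + 1:]
        yield [[first]] + partition        # or open a new block


def r1(edges):
    """Smallest index over all partitions of the edge set."""
    return min(index(p) for p in set_partitions(list(edges)))


k4 = list(combinations(range(4), 2))
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

assert index([k4]) == 2 * 4 - 3   # P = {G} for K4
assert r1(k4) == 2 * 4 - 3        # K4 is rigid (though not minimally rigid)
assert r1(cycle4) < 2 * 4 - 3     # the 4-cycle is not rigid
```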

We  conclude  this  section  with  an  associated  definition 
from  [8]  and  two  results,  one  from  [8],  that  will  assist  us 
in  later  development. 

Definition 2.4: Consider a graph G = (V, E). Suppose S
is a non-empty subset of E, and H is the subgraph induced
by the edge set S. Then S is an independent subset of E if
i_H(X) ≤ 2|X| - 3 for all X ⊂ V_G(S), as long as |X| ≥ 2.
The null set is also an independent subset of E.

The  following  result  is  a  direct  consequence  of  Laman’s 
theorem. 

Lemma 2.1: Suppose a graph G = (V, E) is minimally
rigid. Then for any S ⊂ E, S is a maximally independent
subset of S in the subgraph of G induced by S.

The final result in this section is a translation of Lemma
2.4 from [8].

Lemma 2.2: Consider a graph G = (V, E) with |E| ≥ 1.
Suppose S ⊂ E is a maximally independent subset of E.
Then r_1(G) = |S|.

III.  Redundant  Global  Rigidity 

Generally, there are two types of redundancy: those involving
loss of edges, and those involving the loss of vertices.
As a generalization we work here with mixed redundancy.
We begin with definitions of mixed redundant rigidity and
mixed redundant connectivity.

Definition 3.1: A graph G = (V, E) is (q, p)-rigid if for
all 0 ≤ l ≤ q - 1 and 0 ≤ k ≤ p - 1, the induced subgraph
obtained by removing any l vertices and k edges is rigid.

In particular, a (q, 1)-rigid graph is what has been defined
in [13] as q-vertex rigid, or simply q-rigid. Similarly a
(1, p)-rigid graph is known as p-edge rigid. There is also
a comparable definition for redundant connectivity.

Definition 3.2: A graph G = (V, E) is (q, p)-connected
if for all 0 ≤ l ≤ q - 1 and 0 ≤ k ≤ p - 1, the induced
subgraph obtained by removing any l vertices and k edges
is connected.

These definitions bring us to our first new result, aimed
at tightening them. Specifically, we observe below that these
definitions are stronger than is needed. Indeed it is clear from
Laman's theorem that if a graph is not rigid then the removal
of a further edge cannot make it rigid. The same fact applies
to connectivity. Thus it is possible to alter the definition of
(1, p)-rigidity/connectivity to simply require that the graph
remain rigid/connected after the removal of any p - 1 edges,
as opposed to up to p - 1 edges as required by the foregoing
definitions.

On the other hand, it is entirely possible for a nonrigid
or a disconnected graph to gain rigidity or connectivity after
losing vertices. Thus consider a rigid graph (V, E). Suppose
v ∉ V and let e be an edge connecting v to a vertex in V.
Then it is easy to see that the graph (V ∪ {v}, E ∪ {e}) is
not rigid.

Similarly, if the graph (V, E) is connected, the graph
(V ∪ {v}, E) is not connected, as the vertex v is isolated
in this augmented graph.

We now assert that in fact, barring small graphs, even with
q-vertex rigidity it suffices to check if rigidity is retained after
deleting any q - 1 vertices.

Theorem 3.1: A graph G = (V, E), with |V| > q + 1, is
(q, p)-rigid iff it is rigid after removing any q - 1 vertices
and p - 1 edges.

A similar result obtains for redundant connectivity.

Theorem 3.2: A graph G = (V, E), with |V| > q, is
(q, p)-connected iff it is connected after removing any q - 1
vertices and p - 1 edges.

We  now  recount  a  result  from  [8]  that  ties  connectivity 
and  rigidity  to  global  rigidity. 

Theorem 3.3: A graph G = (V, E) is globally rigid iff it
is (1, 2)-rigid and (3, 1)-connected.
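Definitions 3.1 and 3.2 and the test of Theorem 3.3 can be tried out on small graphs with the Python sketch below. Several things here are assumptions rather than part of the paper: rigidity is tested by a randomized rank computation on the rigidity matrix at random coordinates modulo a large prime (which can, with negligible probability, misreport a rigid graph as non-rigid), the redundancy checks enumerate all removals by brute force, and all function names are hypothetical. The global rigidity test is intended only for graphs on four or more vertices.

```python
import random
from itertools import combinations

PRIME = (1 << 61) - 1  # large prime; all arithmetic is modulo PRIME


def rank_mod_prime(rows):
    """Rank of an integer matrix over GF(PRIME) by Gauss-Jordan elimination."""
    rows = [[x % PRIME for x in row] for row in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, PRIME)  # modular inverse (Python 3.8+)
        rows[rank] = [x * inv % PRIME for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % PRIME for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_rigid(vertices, edges):
    """Randomized test: the rigidity matrix has rank 2|V| - 3."""
    vertices = list(vertices)
    if len(vertices) <= 1:
        return True  # convention adopted here for trivial graphs
    rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    pos = {v: (rng.randrange(PRIME), rng.randrange(PRIME)) for v in vertices}
    col = {v: 2 * k for k, v in enumerate(vertices)}
    rows = []
    for u, v in edges:
        row = [0] * (2 * len(vertices))
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        row[col[u]], row[col[u] + 1] = dx, dy
        row[col[v]], row[col[v] + 1] = -dx, -dy
        rows.append(row)
    return rank_mod_prime(rows) == 2 * len(vertices) - 3


def is_connected(vertices, edges):
    """Depth-first search connectivity test."""
    vertices = list(vertices)
    if len(vertices) <= 1:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(vertices)


def holds_after_removals(vertices, edges, q, p, predicate):
    """Check `predicate` after removing any l <= q-1 vertices and k <= p-1 edges."""
    vertices, edges = list(vertices), list(edges)
    for l in range(q):
        for dead in combinations(vertices, l):
            kept_v = [v for v in vertices if v not in dead]
            kept_e = [e for e in edges if e[0] not in dead and e[1] not in dead]
            for k in range(p):
                for gone in combinations(kept_e, k):
                    rest = [e for e in kept_e if e not in gone]
                    if not predicate(kept_v, rest):
                        return False
    return True


def is_qp_rigid(vertices, edges, q, p):
    return holds_after_removals(vertices, edges, q, p, is_rigid)


def is_qp_connected(vertices, edges, q, p):
    return holds_after_removals(vertices, edges, q, p, is_connected)


def is_globally_rigid(vertices, edges):
    """Theorem 3.3, applied here only to graphs on four or more vertices."""
    return is_qp_rigid(vertices, edges, 1, 2) and is_qp_connected(vertices, edges, 3, 1)


k4 = list(combinations(range(4), 2))
assert is_globally_rigid(range(4), k4)
assert is_rigid(range(4), k4[1:])                # K4 minus an edge is rigid
assert not is_globally_rigid(range(4), k4[1:])   # but not (1, 2)-rigid
```

Working modulo a prime keeps the rank computation exact, which avoids the tolerance choices a floating-point rank test would need.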

We  now  define  redundant  global  rigidity,  which  as  noted 
in  section  II  is  equivalent  to  redundant  localizability  given 
three  or  more  noncollinear  anchors. 

Definition 3.3: A graph G = (V, E) is (q, p)-globally
rigid if it is globally rigid and for all 0 ≤ l ≤ q - 1 and
0 ≤ k ≤ p - 1, the induced subgraph obtained by removing
any l vertices and k edges is globally rigid.

In view of Theorems 3.1-3.3 one has the following result.

Theorem 3.4: A graph G = (V, E), with |V| > q + 1, is
(q, p)-globally rigid iff it is rigid after removing any q - 1
vertices and p edges and is connected after removing any
q + 1 vertices and p - 1 edges.

Thus redundant global rigidity has two components:
redundant rigidity and redundant connectivity.

Finally,  we  come  to  redundant  localizability. 

Definition 3.4: A sensor network with a given set of
inter-node distance measurements is (q, p) redundantly localizable
if it remains localizable after removing any q - 1 nodes and
any p - 1 edges.

Since a network is localizable if and only if it is globally
rigid and has three or more noncollinear anchors, the
previous theorem immediately yields:

Theorem 3.5: A two-dimensional sensor network with a
given set of anchors and inter-node distance measurements
and at least q + 2 nodes is (q, p) redundantly localizable if
and only if its associated graph is rigid after removing any
q - 1 vertices and p edges, connected after removing any q + 1
vertices and p - 1 edges, and the network retains three or
more noncollinear anchors after removing any q - 1 vertices.

The  rest  of  the  paper  is  largely  concerned  with  the 
characterization  of  the  redundant  rigidity. 

IV. Connections between Vertex and Edge Redundancy

In this section we summarize relationships between vertex
and edge redundancy. The uniform message is that vertex
redundancy conditions are stronger than their edge
counterparts.

First  we  present  a  result  concerning  redundant  rigidity. 
Theorem 4.1: A graph G = (V, E), with |V| > p + q + 3,
that is (q + s, p)-rigid for some s > 0, is (q, p + s)-rigid.

Next we address connectivity.

Theorem 4.2: A (q, p)-connected graph is (q - s, p + s)-connected
for all 0 < s < q.

We conclude this section with the following theorem, which
follows directly from Theorems 4.1 and 4.2, together with
the characterization of redundant global rigidity in terms of
redundant rigidity and connectivity in Theorem 3.4.

Theorem 4.3: A graph G = (V, E), with |V| > p + q + 3,
that is (q + s, p)-globally rigid for some s > 0, is
(q, p + s)-globally rigid.

V.  Redundant  Edge  Rigidity:  The  Laman 
Approach 

This section focuses on seeking a Laman type necessary
and sufficient condition for (1, p)-rigidity. First we provide a
natural definition of minimal (1, p)-rigidity generalized from
the standard notion of (nonredundant) minimal rigidity, i.e.
when p = 1.

Definition 5.1: A graph G = (V, E) is minimally (1, p)-rigid
if it is (1, p)-rigid and loses rigidity after the removal
of any set of p edges.

It is not immediately clear that minimally (1, p)-rigid
graphs actually exist for p > 1. It might be that a (1, p)-rigid
graph necessarily has some sets of p edges whose removal
renders the graph nonrigid, and some sets of p edges whose
removal leaves the graph rigid. This is indeed the case for
p > 2, see Lemma 5.1 below (though we do not prove this
here); however for p = 2, the definition is meaningful [8].

We first provide a necessary and sufficient condition for
minimal (1, 2)-rigidity [8].

Theorem 5.1: A graph G = (V, E) is minimally (1, 2)-rigid
if and only if both the conditions below hold:

(a) |E| = 2|V| - 2.

(b) For any X ⊂ V, 1 < |X| < |V|, i_G(X) ≤ 2|X| - 3.

This result is in fact quite powerful, in that beyond a
precise edge count, all it requires is that Laman's condition
be satisfied by the graph induced by any proper subset of V.

Rigidity, as opposed to minimal rigidity, is characterized
through Laman's theorem using the property that for a rigid
graph G = (V, E) there must exist an E' ⊂ E such that
G' = (V, E') is minimally rigid. The example depicted in
Fig. 1(a) shows, however, that this is not true in general for
(1, p)-rigidity when p > 1.

To analyze the graph G = (V, E) depicted in Fig. 1(a),
observe that every vertex in the first and last rows of this
nine-vertex graph has an edge to every vertex in the second
row. Thus in all there are 18 edges in this graph. Obviously,
as

18 > 2 × 9 - 4 + 2,

this graph cannot be minimally (1, 2)-rigid. It is also clear
that there is no E' ⊂ E such that (V, E') is minimally (1, 2)-rigid.
This is so because, were such an E' to exist, E \ E' would
have precisely two elements in it. It is clear that the removal
of any edge from G leaves at least one vertex with degree
equal to two. Then no matter what further edge is removed,
the resulting subgraph cannot be (1, 2)-rigid.

We now assert that G is in fact (1, 2)-rigid. To see this, first
observe that the graph induced by the vertices {1, ..., 6} and
that induced by {4, ..., 9} are each minimally rigid. Indeed
the graph in Fig. 1(b) is minimally rigid. The graph induced
by {1, ..., 6} can be built from this by two successive
edge-splitting operations. The minimal rigidity of the graph
induced by {4, ..., 9} similarly follows.

Also note that there is a complete symmetry between the
edges, in that if we show that the graph obtained by removing
the edge e_14 is rigid, this implies that a graph obtained by
removing any single edge is rigid. It is easy to see that if
edge e_14 is removed, the graph remains rigid. Thus indeed
the graph in Fig. 1 is (1, 2)-rigid.
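Some of the counting claims made about the graph of Fig. 1(a) can be verified mechanically. The Python sketch below assumes, from the description in the text, that vertices 1-3 form the first row, 4-6 the second row and 7-9 the last row, with every first-row and last-row vertex joined to every second-row vertex; the figure itself is not reproduced here, so this labelling is an assumption, as are the function names. The check covers the edge count, the Laman counts of the two induced six-vertex subgraphs, and the degree-two observation; it does not by itself establish (1, 2)-rigidity.

```python
from itertools import combinations

MIDDLE = (4, 5, 6)            # second row (assumed labelling)
OUTER = (1, 2, 3, 7, 8, 9)    # first and last rows (assumed labelling)
EDGES = [(u, m) for u in OUTER for m in MIDDLE]


def induced(edges, subset):
    """Edges with both endpoints inside the vertex subset."""
    keep = set(subset)
    return [e for e in edges if e[0] in keep and e[1] in keep]


def satisfies_laman(vertices, edges):
    """|E| = 2|V| - 3 and i_G(X) <= 2|X| - 3 for every X with |X| >= 2."""
    vertices = list(vertices)
    if len(edges) != 2 * len(vertices) - 3:
        return False
    return all(
        len(induced(edges, x)) <= 2 * size - 3
        for size in range(2, len(vertices) + 1)
        for x in combinations(vertices, size)
    )


def min_degree(edges, vertices):
    """Smallest vertex degree in the graph."""
    return min(sum(v in e for e in edges) for v in vertices)


assert len(EDGES) == 18 and len(EDGES) > 2 * 9 - 4 + 2
assert satisfies_laman(range(1, 7), induced(EDGES, range(1, 7)))
assert satisfies_laman(range(4, 10), induced(EDGES, range(4, 10)))
# Removing any single edge leaves some vertex of degree two.
assert all(
    min_degree([e for e in EDGES if e != gone], range(1, 10)) == 2
    for gone in EDGES
)
```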

The above observation shows that a (1, p)-rigid graph
may not contain a minimally (1, p)-rigid graph. In addition,
the definition of a minimally (1, p)-rigid graph is
restricted to p ≤ 2 only. Seeking an alternative definition
remains an open problem.

Lemma 5.1: For every integer p > 2, there is no minimally
(1, p)-rigid graph satisfying Definition 5.1.

VI.  Redundant  Edge  Rigidity:  The  Lovasz-Yemeni 
Approach 

In this section, we characterize (1, p)-rigidity using the
Lovasz-Yemeni approach. As in Section V we will state the
result attending first to minimally (1, 2)-rigid graphs.

Lemma 6.1: Let G = (V, E) be a minimally (1, 2)-rigid
graph. Then r_2(G) = 2|V| - 2, where r_2(G) is defined in
Definition 2.3.

We next state a result that shows that r_p(G) ≥ 2|V| - 4 + p
is a sufficient condition for (1, p)-rigidity.

Theorem 6.1: Let G = (V, E) be a graph with r_p(G) ≥
2|V| - 4 + p. Then G is (1, p)-rigid.

Combining these two results we immediately obtain the
following necessary and sufficient condition for minimal
(1, 2)-rigidity.

Theorem 6.2: A graph G = (V, E) is minimally (1, 2)-rigid
iff r_2(G) ≥ |E| = 2|V| - 2.

The question remains whether r_p(G) = 2|V| + p - 4 is a
necessary condition for nonminimal (1, p)-rigidity (assuming
that |E| > 2|V| + p - 4). The graph G = (V, E) in Fig. 2
serves as a counterexample.

Consider the three edge sets E_1 = E_G({1, 2, 4, 5}),
E_2 = E_G({3, 7, 4, 8}) and E_3 = E_G({8, 9, 6, 5}). P =
{(V_G(E_i), E_i)}_{i=1}^{3} is a 2-AD for G. In this case:

r(P) = 3 × (8 - 3) = 15 < 2 × 9 - 2 = 16.

The graph is clearly rigid after removing any single edge
other than from the set e_45, e_48, e_58. Now without sacrificing
generality remove the edge e_45. Observe that the resulting
graph can be reconstructed by an edge-splitting operation on
(V \ {2}, E_G(V \ {2})). Thus this graph is (1, 2)-rigid.

To this point we have shown that both Lovasz-Yemeni
and Laman type necessary and sufficient conditions exist
for minimal (1, p)-rigidity (p = 2); the Lovasz-Yemeni
type sufficient condition also characterizes nonminimal
(1, p)-rigidity for all positive integers p.

Fig. 1. (a) A (1, 2)-rigid graph that does not contain a minimally (1, 2)-rigid subgraph; (b) a graph used in proving the claim for the graph of (a).

Recall, however, that the direct subject of investigation
in this paper is redundant global rigidity, which beyond
redundant rigidity also requires redundant connectivity. It
is instructive to note that the example of Fig. 1 is
$(3,1)$-connected, though that of Fig. 2 is not. It is also possible to
show that the graph in Fig. 1 does satisfy the corresponding
Lovász-Yemini sufficient condition for $(1,2)$-rigidity. Indeed
the next section demonstrates that a $(1,p)$-rigid graph that
has an additional redundant connectivity property must have
$r_p(G) \ge 2|V| - 4 + p$.

Clearly, having an efficient algorithm for computing $r_p$ is
important. While we do not include such an algorithm in this
paper, we conjecture that one exists: Lovász and
Yemini [11] claimed that $r_1(G)$ is easy to compute, and by
definition finding $r_p(G)$, $p > 1$, involves computation of
sums of vertex cardinalities involving a set of subgraphs that
is also a subset of those needed to compute $r_1(G)$.


VII. Lovász-Yemini Type Condition for Redundant Rigidity with Redundant Connectivity

This section considers levels of connectivity needed for a
$(1,p)$-rigid graph to satisfy $r_p(G) \ge 2|V| - 4 + p$. At the
end of this section we will tie the results here to redundant
global rigidity.

We have the following result connecting Lovász-Yemini
type conditions to $(1,p)$-global rigidity for $p \le 4$.

Theorem 7.1: Suppose, for integer $p$ with $2 \le p \le 4$,
$G = (V, E)$ is $(3, p-1)$-connected and $(1,p)$-rigid. Then
$r_p(G) \ge 2|V| - 4 + p$.

There  remains  the  question  whether  the  result  of  Theorem 
7.1  will  hold  for  p  >  4.  We  now  present  a  counterexample 
for  p  =  5. 

Example 7.1: The graph in question, $G = (V, E)$, has
$|V| = 33$, and has a vertex cover $V_1, \dots, V_6$ such that each
$|V_i| = 7$, each subgraph $(V_i, E_G(V_i))$ is a $K_7$ graph, and
the sets $E_i = E_G(V_i)$ constitute a partition of $E$. Further,
defining

$X_i = V_i \cap \left( \bigcup_{j \ne i} V_j \right)$,  (VII.2)

one has $X_1 = \{1, 2, 3\}$, $X_2 = \{1, 4, 5\}$, $X_3 = \{2, 6, 7\}$,
$X_4 = \{3, 8, 9\}$, $X_5 = \{4, 6, 8\}$ and $X_6 = \{5, 7, 9\}$. In other
words $V_i$ comprises $X_i$ together with four other vertices that
are not in any other $V_j$. Thus there are 24 vertices each
belonging to just one $V_i$, and 9 which are in exactly two.

To see that this choice of $X_i$ is consistent with the fact
that the $E_i$ partition $E$, observe that for all $i \ne j$,

$|X_i \cap X_j| \le 1$.  (VII.3)

Further, vertices in each $V_i$ connect to the others through the
elements of $X_i$. Thus, indeed this definition is consistent with
the requirement that the $E_i$ are disjoint.

In effect then $G$ comprises six $K_7$ subgraphs, connected
through the vertices in the $X_i$. Each $V_i$ has three vertices
through which it connects to other $V_j$, there being one vertex
in common with each of three different $V_j$.
One can show that $G$ is both $(3,4)$-connected and $(1,5)$-rigid.
Since each subgraph $(V_i, E_G(V_i))$ is $K_7$, it also follows
that $V_i = V_G(E_i)$. Thus as the $E_i$ partition $E$, we have that

$P = \{(V_i, E_G(V_i))\}_{i=1}^{6}$

is a 6-AD of $G$. Hence,

$r_5(G) \le r(P) = 6(14 - 3) = 66 = 2|V| < 2|V| - 4 + 5$,

and the conclusion of Theorem 7.1 fails for $p = 5$.
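
The counting in Example 7.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the labels 10 to 33 for the private vertices are an assumption made purely for illustration, since the text only fixes the shared vertices 1 to 9.

```python
from itertools import combinations

# Shared vertex sets X_1..X_6 exactly as listed in Example 7.1.
X = [{1, 2, 3}, {1, 4, 5}, {2, 6, 7}, {3, 8, 9}, {4, 6, 8}, {5, 7, 9}]

def build_example():
    """Return the vertex sets V_i and the K_7 edge sets E_i.

    Each V_i is X_i plus four private vertices; the private labels
    (10, 11, ...) are hypothetical and chosen only for this sketch.
    """
    vertex_sets, next_label = [], 10
    for shared in X:
        private = set(range(next_label, next_label + 4))
        next_label += 4
        vertex_sets.append(shared | private)
    edge_sets = [
        {frozenset(pair) for pair in combinations(sorted(vs), 2)}
        for vs in vertex_sets
    ]
    return vertex_sets, edge_sets

vertex_sets, edge_sets = build_example()
all_vertices = set().union(*vertex_sets)
all_edges = set().union(*edge_sets)

# Index r(P) of the decomposition P = {(V_i, E_i)}: sum of (2|V_i| - 3).
index = sum(2 * len(vs) - 3 for vs in vertex_sets)

# Largest overlap between two distinct V_i; at most 1 means no shared edge.
max_overlap = max(len(a & b) for a, b in combinations(vertex_sets, 2))

print(len(all_vertices), len(all_edges), index, max_overlap)
```

With these labels the six edge sets are pairwise disjoint, so the union has 6 x 21 = 126 edges, and the index 66 falls one short of the bound 2|V| - 4 + 5 = 67.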

Nonetheless we now show that for $p > 4$, a stronger
redundant connectivity condition suffices for
$r_p(G) \ge 2|V| - 4 + p$ to hold given $(1,p)$-rigidity.

Theorem 7.2: Suppose, for integer $p > 3$, $G = (V, E)$ is
$(4, p-2)$-connected and $(1,p)$-rigid. Then
$r_p(G) \ge 2|V| - 4 + p$.

As global rigidity is equivalent to the combined properties
of $(3,1)$-connectivity and $(1,2)$-rigidity, in view of the two
theorems in this section we have the following key result.

Theorem 7.3: A graph $G = (V, E)$ is $(1,p)$-globally rigid
if it is $(3,p)$-connected and $r_{p+1}(G) \ge 2|V| - 3 + p$. For
$1 \le p \le 3$ it is $(1,p)$-globally rigid only if
$r_{p+1}(G) \ge 2|V| - 3 + p$. For $p \ge 4$ a $(4, p-2)$-connected $G = (V, E)$
is $(1,p)$-globally rigid only if $r_{p+1}(G) \ge 2|V| - 3 + p$.

Given  the  relation  between  global  rigidity,  anchor  count 
and  localizability,  there  is  an  obvious  extension  of  this  to 
a  corresponding  characterization  of  redundantly  localizable 
two-dimensional  sensor  networks. 

We  comment  that  in  a  sensor  network  in  which  loss  of 
nodes  is  contemplated,  it  may  be  that  loss  of  anchor  nodes 
can  be  ruled  out  on  reliability  or  other  grounds.  What  is 
needed  for  redundant  localizability  in  addition  to  redundant 
global  rigidity  is  that  after  loss  of  up  to  q-1  nodes,  at  least 
three  noncollinear  anchor  nodes  remain.  The  issue  of  which 
nodes  fail  does  not  enter  the  picture  when  the  redundancy  is 
all  with  respect  to  edge  loss  of  course. 

VIII.  Conclusions  and  Future  Work 

This work presents preliminary results on edge-redundant
global rigidity, which is essential to the problem of
guaranteeing network localizability in the event of
link losses. The particular notion that is studied in detail is
$(1,p)$-rigidity, for which two different types of characterizations
were obtained: Laman type and Lovász-Yemini type.
For the latter, a further connection was established between
redundant edge rigidity and redundant connectivity. We also
show that a seemingly obvious definition of minimal
$(1,p)$-rigidity is meaningless for $p > 2$.

As discussed in the beginning of this paper, it is of great
importance to also study redundant vertex rigidity, i.e. $(q,1)$-rigidity.
Some discussion is provided in the text but its
characterization remains largely open. Also, one will need
computationally efficient algorithms to check for satisfaction
of these conditions (e.g. computing $r_p(G)$) for redundant
localizability of a network. Moreover, operations are required
for one to construct, augment, merge and split such networks,
while ensuring the level of redundant localizability remains
unchanged. Results along the lines of [1] providing simple
sufficient conditions for redundant global rigidity, largely in
terms of local properties of graphs, would also be welcome.
At some point too, random geometric graphs should be
investigated.
Redundant Localizability of Sensor Networks *


Changbin  Yu,  Soura  Dasgupta,  Brian  D.O.  Anderson 


Abstract 

The  ability  to  localize  a  sensor  network  is  important  for  its  deployment.  A  theoretical  result  exists  defining  necessary  and  sufficient 
conditions  for  network  unique  localizability  (for  inter-sensor  range-based  localization);  it  has  its  roots  in  Graph  Rigidity  Theory  where 
sensors  and  links/measurements  are  modelled  as  vertices  and  edges  of  a  graph,  respectively.  However,  critical  missions  do  require  a  level  of 
robustness  for  localizability,  ensuring  that  localizability  is  retained  in  the  event  of  link  (edge)  losses  and/or  sensor  (vertex)  losses.  This  work 
characterizes  this  robustness  through  a  novel  notion  of  redundant  localizability,  which  is  backed  by  redundant  rigidity.  Analogously  to  two 
well-known  types  of  result  for  rigidity  characterization,  similar  results  are  developed  for  edge  redundant  rigidity;  they  are  supplemented  by 
rather  fewer  results  dealing  with  vertex  redundant  rigidity.  These  results  form  a  foundation  for  any  further  study  of  redundant  localizability. 


Key  words: 

Rigidity  Theory,  Network  Localizability,  Sensor  Network  Localization 


1  Introduction 

Obtaining  location  information  is  a  fundamental  task  in  a  sensor  network,  since  otherwise  the  sensed  data  will  become  much 
less  valuable.  The  location  of  every  sensor  node,  if  not  already  known  from  the  deployment  of  the  network  or  directly  from 
GPS,  can  only  be  determined  from  a  process  based  on  measurements,  the  network  structure  and  partially  known  location 
information.  This  process  is  referred  to  as  localization,  and  the  network  property  that  governs  the  feasibility  of  localizing  the 
entire  network  given  the  measurement  is  called  localizability. 

Most  localization  algorithms  are  based  on  range-measurements,  or  measurements  that  can  be  converted  to  inter-node  distance 
measurements. 

Aspnes et al. [2] formally prove that for a 2D localization problem, a necessary and sufficient condition for localizability given
inter-node distance measurements is that the network graph must be globally rigid (the concept is reviewed below) and at least
three noncollinear sensors must have known location information (hence they are anchors).

Having  just  global  rigidity  may  be  unrealistic  in  practical  scenarios,  as  not  only  can  the  localization  algorithms  demand  very 
high  computational  complexity,  but  also  because  the  global  rigidity  property  (hence  localizability)  can  easily  be  lost  if  some 
node  or  measurement  becomes  unavailable,  manifesting  some  type  of  node/link  failure  in  the  network. 

The first problem has attracted great interest among researchers. Anderson et al. [1] discussed various graphical properties of
easily localizable sensor networks. A main result is that by doubling or tripling the sensing radius, the new network graph
acquires special properties that provide guarantees on the computational complexity, sometimes linear in the number of sensors.
In [14], limitations of classical trilateration algorithms are analyzed, proving their insufficiency in even recognizing the
localizability of a graph. A novel localization method generalizing trilateration is proposed based on aggregated knowledge of
the subnetwork formed by all 1-hop neighbors of each node and the node itself; roughly speaking, the induced subgraph of the
neighbors and the node is actually a wheel graph that is globally rigid.

* A preliminary version of this paper has been published in the Proceedings of the 49th IEEE Conference on Decision and Control (CDC
2010), entitled 'Network Localizability with Link or Node Losses'. This work was supported by USAF-AOARD-10-4102. C. Yu is supported
by the Australian Research Council (ARC) through a Queen Elizabeth II Fellowship under DP-110100538 and the Overseas Expert Program
of Shandong Province. S. Dasgupta is supported by US NSF grants ECS-0622017, CCF-072902, and CCF-0830747. B.D.O. Anderson is
supported by the ARC and National ICT Australia (NICTA).

The second problem, viz. ensuring tolerance of node or link failure, has rarely been addressed. A recent study of redundant
rigidity [13] has started to shed light on a new direction, using a graph theoretical approach. It raises the question, pursued in
depth in this paper: what are the conditions for preserving localizability in the event of loss of up to p link length measurements
and q nodes (together with all their associated distance measurements)?

This work studies the level of redundancy that can be built into localizability, such that the network is guaranteed to be
localizable when the loss of nodes and/or links is allowed, up to a certain maximum number in each case. As the result is more
of a fundamental characterization than a practical algorithm, the typical problems in practical localization, such as dealing
with noisy measurements, characterization of RMS error and bias in position estimates, error propagation, and computational
complexity, are not considered and remain for future work.

The rest of this paper is organized as follows. The preliminaries, especially relevant results of graph rigidity theory, are
introduced in Section 2. The connections between redundant rigidity, redundant connectivity, global rigidity and redundant
localizability are established in Section 3. Based on these, Section 4 discusses the relationship between vertex and edge redundancy.
Two characterizations of minimal $(1,p)$-rigidity, a concept associated with the loss of links, are then given in Section 5 and
Section 6, respectively. Section 7 establishes the relationship between redundant connectivity and $(1,p)$-rigidity on the one
hand and redundant global rigidity and localizability on the other. Finally, the conclusions and future work are presented.

2  Preliminaries 

We will model a network as an undirected graph $G = (V, E)$ where the nodes are elements of the vertex set $V$, and an edge
$e_{ij} \in E$ if the distance between nodes $i$ and $j$ is available. In the sequel, for $X \subseteq V$, $i_G(X)$ will be the number of edges in the
graph induced by the vertex set $X$ in $G$. Further, with $V_i \subseteq V$, the edge set in the graph induced by the vertex set $V_i$ in $G$ will
be denoted by $E_G(V_i)$. Similarly, with $E_i \subseteq E$, the vertex set in the graph induced by the edge set $E_i$ in $G$ will be denoted by
$V_G(E_i)$.


As noted in the introduction, a result due to [2] proves that a two-dimensional network $G = (V, E)$ is localizable iff it is globally
rigid and has at least three noncollinear anchor nodes. Thus in Section 2.1 we provide some preliminary information about
rigidity and global rigidity.

In Section 2.2 we provide two alternative, though equivalent, characterizations of rigidity. The first, a celebrated result due
to Laman [10], involves edge counts on subgraphs induced by components of vertex partitions. The second, primarily due to
Lovász and Yemini [11], involves vertex counts on subgraphs induced by components of edge partitions.

Subsequently,  in  Section  3  we  provide  pertinent  redundancy  definitions  and  an  important  linkage  between  redundant  rigidity, 
redundant  connectivity  and  global  rigidity,  that  sets  up  the  remainder  of  the  paper. 

2.1  Rigidity  and  Global  Rigidity 

In this subsection, we recall briefly the notions and some properties of (minimal) rigidity and global rigidity. Our description
will be largely based on graphs that model the network under consideration. Formal definitions of graph rigidity and global
rigidity, which involve the theory of graph representations, can be found in, e.g., [4, 5, 12].

A graph is rigid if the only edge-length preserving smooth motions in a generic network$^1$ modelled by the graph are translation
and rotation, which only result in congruent graphs. A graph is minimally rigid if it is rigid and no single edge can be removed
without losing rigidity.

Figure 1 shows several examples of two-dimensional graphs, two of which are rigid and one of which is not rigid. In a network
corresponding to a non-rigid graph, part of the network can 'flex' or move while preserving the edge lengths, while the rest of
the network stays still. The notion of rigidity conforms to one's normal intuition.


1  The  graph  theory  literature  generally  uses  a  different  term  than  network  in  discussing  rigidity,  the  word  framework  being  commonly 
employed.  Because  of  this  paper’s  connection  with  sensor  networks,  we  retain  the  word  network  below. 
Fig. 2. Illustration of flip-ambiguity from (a) to (b); and flex-ambiguity from (c) to (d). Lengths of all corresponding edges are the same.

Close inspection of the rigidity definition reveals that it considers only smooth motions. Under a discontinuous
transformation of a network corresponding to a rigid graph, it is possible to obtain a non-congruent network that preserves the
edge lengths due to flip and/or flex ambiguity. This is illustrated in Fig. 2.


To eliminate these ambiguities and to uniquely determine the relative positions of any vertex given the set of distance measurements
(corresponding to the edge-length set), a stronger definition is desirable. A graph is globally rigid if any two realizations
of a network (with prescribed edge lengths and modelled by the graph) with the same edge-length set are congruent, i.e., differ at
most by translation, rotation or reflection. We refer the reader to [4, 8] for source material on global rigidity.


2.2  Rigidity  Characterization 


We provide now two different sets of necessary and sufficient conditions for a graph to be rigid. The first, due to Laman [10],
involves edge counts on graphs induced by vertex subsets. We present this theorem below.

Theorem 2.1 [Laman's Theorem] A graph $G = (V, E)$ is minimally rigid iff $|E| = 2|V| - 3$ and for all $X \subseteq V$ with $|X| \ge 2$, $i_G(X) \le 2|X| - 3$.
The graph is rigid iff there is an $E' \subseteq E$ such that $G' = (V, E')$ is minimally rigid.
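
The edge-count conditions of Theorem 2.1 can be tested directly on small graphs. The sketch below is a brute-force illustration only, exponential in the graph size, and is not one of the efficient algorithms of [9]; the function names are hypothetical, and the convention that graphs with fewer than two vertices are rejected is an assumption of this sketch.

```python
from itertools import combinations

def induced_edge_count(edges, subset):
    """i_G(X): number of edges with both endpoints in the vertex subset X."""
    members = set(subset)
    return sum(1 for u, v in edges if u in members and v in members)

def is_minimally_rigid(vertices, edges):
    """Laman's counts: |E| = 2|V| - 3 and i_G(X) <= 2|X| - 3 for |X| >= 2."""
    vertices, edges = list(vertices), list(edges)
    n = len(vertices)
    if n < 2 or len(edges) != 2 * n - 3:
        return False
    for size in range(2, n + 1):
        for subset in combinations(vertices, size):
            if induced_edge_count(edges, subset) > 2 * size - 3:
                return False
    return True

def is_rigid(vertices, edges):
    """Rigid iff some edge subset E' makes (V, E') minimally rigid."""
    vertices, edges = list(vertices), list(edges)
    target = 2 * len(vertices) - 3
    if target < 0:
        return False
    return any(is_minimally_rigid(vertices, chosen)
               for chosen in combinations(edges, target))

triangle = [(1, 2), (2, 3), (1, 3)]
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
k4 = list(combinations([1, 2, 3, 4], 2))

print(is_minimally_rigid([1, 2, 3], triangle))   # triangle
print(is_rigid([1, 2, 3, 4], square))            # 4-cycle
print(is_rigid([1, 2, 3, 4], k4))                # complete graph K_4
```

Here the 4-cycle fails because it has only four edges, one fewer than the $2|V| - 3 = 5$ required, while $K_4$ is rigid but not minimally rigid.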

In the sequel, we will also need the following result that presents Henneberg Operations for growing a rigid graph [7].

Theorem 2.2 [Vertex Addition] Consider a graph $G = (V, E)$, a node $k \notin V$ and two edges $e_{ik}$ and $e_{jk}$, with $i$ and $j$ elements
of $V$. Then $G$ is (minimally) rigid iff $(V \cup \{k\}, E \cup \{e_{ik}, e_{jk}\})$ is (minimally) rigid.

Theorem 2.3 [Edge Splitting] Consider a graph $G = (V, E)$, a node $t \notin V$ and three edges $e_{ti}$, $e_{tj}$ and $e_{tk}$, with $i$, $j$ and $k$
elements of $V$ and $e_{ij}$ an element of $E$. Then $G$ is (minimally) rigid iff $(V \cup \{t\}, (E \setminus \{e_{ij}\}) \cup \{e_{ti}, e_{tj}, e_{tk}\})$ is (minimally) rigid.

There  are  efficient  algorithms  for  checking  the  edge  count  condition  underlying  Laman’s  theorem  [9]. 

The second necessary and sufficient condition, due to Lovász and Yemini [11], involves vertex counts on graphs induced by
edge subsets. Though less succinctly phrased, it may be easier to check than Laman's edge count result, and references to
algorithms for checking the condition are provided in [11]. Jackson and Jordan [8] provide an elegant interpretation of this
result using the theory of matroids. However, even though we rely heavily on results from [8], to avoid the need for a primer
on matroids, we eschew matroid terminology, and use our own somewhat nonstandard but equivalent terminology.
Definition 2.1 Consider a graph $G = (V, E)$ and $E_i \subseteq E$. A set of subgraphs $P = \{G_i = (V_G(E_i), E_i)\}$ is termed an m-Admissible
Decomposition (AD) of $G$ if the following hold.

(i) $|E_i| > 0$.

(ii) $\bigcup_i E_i = E$.

(iii) $P$ has at least $m$ elements.

If $m = 1$, then $P$ is simply called an AD of $G$.

Next we define the index of an m-AD that reflects an associated vertex count.

Definition 2.2 Consider a graph $G = (V, E)$ and $E_i \subseteq E$. The index $r(P)$ of an m-AD $P = \{G_i = (V_G(E_i), E_i)\}_{i=1}^{k}$ of $G$,
where $k \ge m$, is defined as

$r(P) = \sum_{i=1}^{k} \left( 2|V_G(E_i)| - 3 \right)$.  (2.1)
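
As an illustration of Definitions 2.1 and 2.2, the following minimal Python sketch evaluates the index (2.1) of a decomposition given as a list of edge sets. The function names and the edge-list representation are assumptions of this sketch, not notation from the paper.

```python
def vertices_of(edge_set):
    """V_G(E_i): the vertex set induced by an edge set."""
    return {v for edge in edge_set for v in edge}

def index_of_decomposition(parts):
    """r(P) = sum over the parts E_i of (2|V_G(E_i)| - 3), as in (2.1)."""
    return sum(2 * len(vertices_of(part)) - 3 for part in parts)

def is_admissible(parts, all_edges, m=1):
    """Check conditions (i)-(iii) of Definition 2.1 for an m-AD."""
    covered = {frozenset(e) for part in parts for e in part}
    return (all(len(part) > 0 for part in parts)
            and covered == {frozenset(e) for e in all_edges}
            and len(parts) >= m)

# Complete graph K_4 on vertices 1..4.
k4_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

whole = [k4_edges]                      # P = {G}: a 1-AD
singletons = [[e] for e in k4_edges]    # one part per edge: a 6-AD

print(index_of_decomposition(whole), index_of_decomposition(singletons))
```

For $K_4$ the trivial decomposition $P = \{G\}$ has index $2 \cdot 4 - 3 = 5$, while the decomposition into single edges has index $|E| = 6$, since each single edge contributes $2 \cdot 2 - 3 = 1$.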


To  state  the  result  we  need  one  more  definition. 


Definition 2.3 Consider a graph $G = (V, E)$ and $E_i \subseteq E$. For a given $m$, an m-AD $P = \{G_i = (V_G(E_i), E_i)\}_{i=1}^{k}$ of $G$, $k \ge m$, is
termed m-minimizing if $r(P)$ is the smallest among the indices of all possible m-ADs of $G$. Such an index will be denoted as
$r_m(G)$.

Now we provide the promised second characterization of rigidity that follows from results in [8], [11] and [6].

Theorem 2.4 A graph $G$ is rigid iff $r_1(G) = 2|V| - 3$. Further, the edge sets $E_i$ underlying any minimizing admissible
decomposition are disjoint.

Observe that one cannot say that $r_1(G) = 2|V| - 3$ implies minimal rigidity. Indeed consider the graph $G = K_4$. Then with $P = \{G\}$,
$r(P) = 2|V| - 3$, even though $G$ is not minimally rigid.

We  conclude  this  section  with  an  associated  definition  from  [8]  and  two  results,  one  from  [8],  that  will  assist  us  in  later 
development. 

Definition 2.4 Consider a graph $G = (V, E)$. Suppose $S$ is a non-empty subset of $E$, and $H$ is the subgraph induced by the edge
set $S$. Then $S$ is an independent subset of $E$ if $i_H(X) \le 2|X| - 3$ for all $X \subseteq V_G(S)$, as long as $|X| \ge 2$. The null set is also an
independent subset of $E$.

The  following  result  is  a  direct  consequence  of  Laman’s  theorem. 

Lemma 2.1 Suppose a graph $G = (V, E)$ is minimally rigid. Then for any $S \subseteq E$, $S$ is a maximally independent subset of $S$ in
the subgraph of $G$ induced by $S$.


The  final  result  in  this  section  is  a  translation  of  Lemma  2.4  from  [8]. 

Lemma 2.2 Consider a graph $G = (V, E)$ with $|E| \ge 1$. Suppose $S \subseteq E$ is a maximally independent subset of $E$. Then $r_1(G) = |S|$.


3  Redundant  Global  Rigidity 

Generally,  there  are  two  types  of  redundancy:  those  involving  loss  of  edges,  and  those  involving  the  loss  of  vertices.  As 
a  generalization  we  work  here  with  mixed  redundancy.  We  begin  with  definitions  of  mixed  redundant  rigidity  and  mixed 
redundant  connectivity. 

Definition 3.1 A graph $G = (V, E)$ is $(q,p)$-rigid if for all $0 \le l \le q - 1$ and $0 \le k \le p - 1$, the induced subgraph obtained by
removing any $l$ vertices and $k$ edges is rigid.
In particular, a $(q,1)$-rigid graph is what has been defined in [13] as q-vertex rigid, or simply q-rigid. Similarly a $(1,p)$-rigid
graph is known as p-edge rigid. There is also a comparable definition for redundant connectivity.

Definition 3.2 A graph $G = (V, E)$ is $(q,p)$-connected if for all $0 \le l \le q - 1$ and $0 \le k \le p - 1$, the induced subgraph obtained
by removing any $l$ vertices and $k$ edges is connected.
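
Definition 3.2 translates directly into an exhaustive test for small graphs. The following Python sketch is illustrative only: the function names are hypothetical, the search is exponential, and treating a graph with at most one vertex as connected is a convention assumed here rather than taken from the paper.

```python
from itertools import combinations

def is_connected(vertices, edges):
    """Depth-first search connectivity test on an undirected graph."""
    vertices = set(vertices)
    if len(vertices) <= 1:
        return True  # convention assumed for this sketch
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def is_qp_connected(vertices, edges, q, p):
    """Check Definition 3.2 by removing every l <= q-1 vertices, k <= p-1 edges."""
    vertices, edges = list(vertices), list(edges)
    for l in range(q):
        for removed_vertices in combinations(vertices, l):
            gone = set(removed_vertices)
            kept_v = [v for v in vertices if v not in gone]
            kept_e = [e for e in edges if e[0] not in gone and e[1] not in gone]
            for k in range(p):
                for removed_edges in combinations(range(len(kept_e)), k):
                    skip = set(removed_edges)
                    rest = [e for i, e in enumerate(kept_e) if i not in skip]
                    if not is_connected(kept_v, rest):
                        return False
    return True

cycle4 = [(1, 2), (2, 3), (3, 4), (4, 1)]
k4 = list(combinations([1, 2, 3, 4], 2))

print(is_qp_connected([1, 2, 3, 4], cycle4, 1, 2))
print(is_qp_connected([1, 2, 3, 4], cycle4, 1, 3))
print(is_qp_connected([1, 2, 3, 4], k4, 3, 1))
```

A 4-cycle survives the loss of any one edge or any one vertex but not of two edges, whereas $K_4$ stays connected after the loss of any two vertices.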

These definitions bring us to our first new result, aimed at tightening them. Specifically, we show below that these definitions
are stronger than is needed. Indeed it is clear from Laman's theorem that if a graph is not rigid then the removal of a
further edge cannot make it rigid. The same fact applies to connectivity. Thus it is possible to alter the definition of
$(1,p)$-rigidity/connectivity to simply require that the graph remain rigid/connected after the removal of any $p - 1$ edges, as opposed
to up to $p - 1$ edges as required by the foregoing definitions.

On the other hand, it is entirely possible for a nonrigid or a disconnected graph to gain rigidity or connectivity after losing
vertices. Thus consider a rigid graph $(V, E)$. Suppose $v \notin V$ and let $e$ be an edge connecting $v$ to a vertex in $V$. Then it is easy
to see that the graph $(V \cup \{v\}, E \cup \{e\})$ is not rigid.

Similarly if the graph $(V, E)$ is connected, the graph $(V \cup \{v\}, E)$ is not connected as the vertex $v$ is isolated in this augmented
graph.

We now show that in fact, barring small graphs, even with q-vertex rigidity it suffices to check if rigidity is retained after
deleting any $q - 1$ vertices.

Theorem 3.1 A graph $G = (V, E)$, with $|V| > q + 1$, is $(q,p)$-rigid iff it is rigid after removing any $q - 1$ vertices and $p - 1$
edges.

Proof: It is clear that a graph that is $(q,p)$-rigid must be rigid after removing any $q - 1$ vertices and $p - 1$ edges. Now
suppose the graph is rigid after removing any $q - 1$ vertices and $p - 1$ edges. To establish a contradiction, suppose the graph
is not $(q,p)$-rigid. Observe that if a graph is not rigid, then no graph with the same vertices and a subset of edges can be rigid.
Thus there is a set of vertices $V'$, with $0 \le |V'| = l < q - 1$, and a set of $p - 1$ edges $E_p$, such that with $V'' = V \setminus V'$ the graph
$G'' = (V'', E_G(V'') \setminus E_p)$ is not rigid but for some $j \in V''$ the graph $(V'' \setminus \{j\}, E_G(V'' \setminus \{j\}) \setminus E_p)$ is rigid. Then because
of Theorem 2.2, $j$ must have at most one edge in $G'' = (V'', E_G(V'') \setminus E_p)$. Now there are at least $q + 1 - l > 2$ vertices in
$V'' \setminus \{j\}$. For any $\{k_1, \dots, k_{q-1-l}\} \subseteq V'' \setminus \{j\}$, consider the set $V''' = V'' \setminus \{k_1, \dots, k_{q-1-l}\}$. Observe $|V'''| = |V| - q + 1$. Thus
by assumption $(V''', E_G(V''') \setminus E_p)$ is rigid. At the same time $V'''$ has at least three elements, contains $j$, and yet $j$ has at most
one edge in $V'''$, establishing a contradiction. ■

A  similar  result  obtains  for  redundant  connectivity. 

Theorem 3.2 A graph $G = (V, E)$, with $|V| > q$, is $(q,p)$-connected iff it is connected after removing any $q - 1$ vertices and
$p - 1$ edges.

Proof: It is clear that a graph that is $(q,p)$-connected must be connected after removing any $q - 1$ vertices and $p - 1$ edges.
Now suppose the graph is connected after removing any $q - 1$ vertices and $p - 1$ edges. To establish a contradiction, suppose
the graph is not $(q,p)$-connected. Observe that if a graph is not connected, then no graph with the same vertices and a subset of
edges can be connected. Thus there is a set of vertices $V'$, with $0 \le |V'| = l < q - 1$, and a set of $p - 1$ edges $E_p$, such that with
$V'' = V \setminus V'$ the graph $G'' = (V'', E_G(V'') \setminus E_p)$ is not connected but for some $j \in V''$ the graph $(V'' \setminus \{j\}, E_G(V'' \setminus \{j\}) \setminus E_p)$
is connected. Then $j$ must be isolated in $G'' = (V'', E_G(V'') \setminus E_p)$. Now there are at least $q + 1 - l > 1$ vertices in $V'' \setminus \{j\}$.
For any $\{k_1, \dots, k_{q-1-l}\} \subseteq V'' \setminus \{j\}$, consider the set $V''' = V'' \setminus \{k_1, \dots, k_{q-1-l}\}$. Observe $|V'''| = |V| - q + 1$. Thus by
assumption $(V''', E_G(V''') \setminus E_p)$ is connected. At the same time $V'''$ has at least two elements, contains $j$, and yet $j$ is isolated
in $V'''$, establishing a contradiction. ■


We  now  recount  a  result  from  [8]  that  ties  connectivity  and  rigidity  to  global  rigidity. 

Theorem 3.3 A graph $G = (V, E)$ is globally rigid iff it is $(1,2)$-rigid and $(3,1)$-connected.
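
Theorem 3.3 suggests a direct, if inefficient, test for global rigidity of small graphs: check rigidity after each single edge deletion and connectivity after each deletion of up to two vertices. The Python sketch below does this by brute force using Laman's counts. The function names are hypothetical, and the sketch assumes graphs with at least four vertices, since very small graphs such as a triangle need separate treatment.

```python
from itertools import combinations

def is_rigid(vertices, edges):
    """Brute-force planar rigidity via Laman's counts (small graphs only)."""
    vertices, edges = list(vertices), list(edges)
    n = len(vertices)
    if n < 2:
        return True  # trivial case, by convention in this sketch

    def laman(chosen):
        for size in range(2, n + 1):
            for subset in combinations(vertices, size):
                members = set(subset)
                inside = sum(1 for u, v in chosen if u in members and v in members)
                if inside > 2 * size - 3:
                    return False
        return True

    return any(laman(chosen) for chosen in combinations(edges, 2 * n - 3))

def is_connected(vertices, edges):
    """Depth-first search connectivity test."""
    vertices = set(vertices)
    if len(vertices) <= 1:
        return True
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def is_globally_rigid(vertices, edges):
    """(1,2)-rigid and (3,1)-connected, as in Theorem 3.3 (assumes |V| >= 4)."""
    vertices, edges = list(vertices), list(edges)
    # (1,2)-rigid: rigid, and still rigid after deleting any single edge.
    if not is_rigid(vertices, edges):
        return False
    for edge in edges:
        if not is_rigid(vertices, [f for f in edges if f != edge]):
            return False
    # (3,1)-connected: connected after deleting any 0, 1 or 2 vertices.
    for l in range(3):
        for removed in combinations(vertices, l):
            gone = set(removed)
            kept_v = [v for v in vertices if v not in gone]
            kept_e = [(u, v) for u, v in edges if u not in gone and v not in gone]
            if not is_connected(kept_v, kept_e):
                return False
    return True

k4 = list(combinations([1, 2, 3, 4], 2))
# Wheel graph: hub 0 joined to the rim cycle 1-2-3-4.
wheel = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4), (4, 1)]

print(is_globally_rigid([1, 2, 3, 4], k4))
print(is_globally_rigid([0, 1, 2, 3, 4], wheel))
```

Removing one edge from $K_4$ leaves a graph that is rigid but no longer $(1,2)$-rigid, so the test rejects it, which matches the flip ambiguity of two triangles sharing an edge.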

We  now  define  redundant  global  rigidity,  which  as  noted  in  section  2  is  equivalent  to  redundant  localizability  given  three  or 
more  noncollinear  anchors. 
Definition 3.3 A graph $G = (V, E)$ is $(q,p)$-globally rigid if it is globally rigid and for all $0 \le l \le q - 1$ and $0 \le k \le p - 1$ the
induced subgraph obtained by removing any $l$ vertices and $k$ edges is globally rigid.

In view of Theorems 3.1-3.3 one has the following result.

Theorem 3.4 A graph $G = (V, E)$, with $|V| > q + 1$, is $(q,p)$-globally rigid iff it is rigid after removing any $q - 1$ vertices and
$p$ edges and is connected after removing any $q + 1$ vertices and $p - 1$ edges.

Thus  redundant  global  rigidity  has  two  components:  Redundant  rigidity  and  redundant  connectivity. 

Finally,  we  come  to  redundant  localizability. 

Definition 3.4 A sensor network with a given set of inter-node distance measurements is $(q,p)$ redundantly localizable if it
remains localizable after removing any $q - 1$ nodes and any $p - 1$ edges.

Since  a  network  is  localizable  if  and  only  if  it  is  globally  rigid  and  has  three  or  more  noncollinear  anchors,  the  previous  theorem 
immediately  yields: 

Theorem 3.5 A two-dimensional sensor network with a given set of anchors and inter-node distance measurements and at
least $q + 2$ nodes is $(q,p)$ redundantly localizable if and only if its associated graph is rigid after removing any $q - 1$ vertices
and $p$ edges, connected after removing any $q + 1$ vertices and $p - 1$ edges, and the network retains three or more noncollinear
anchors after removing any $q - 1$ vertices.

The  rest  of  the  paper  is  largely  concerned  with  the  characterization  of  the  redundant  rigidity. 


4  Connections  between  vertex  and  edge  redundancy 

In  this  section  we  establish  relationships  between  vertex  and  edge  redundancy.  The  uniform  message  is  that  vertex  redundancy 
conditions  are  stronger  than  their  edge  counterparts. 

First  we  present  a  result  concerning  redundant  rigidity. 

Theorem 4.1 A graph $G = (V, E)$, with $|V| \ge p + q + 3$, that is $(q+s, p)$-rigid for some $s > 0$, is $(q, p+s)$-rigid.

Proof: Consider any graph $G' = (V, E')$ where $E'$ is a subset of $E$ and $|E'| = |E| - p + 1$. By definition $G'$ is $(q+s, 1)$-rigid.
By Lemma 4 of [13] $G'$ is $(q, s+1)$-rigid. Thus by Theorem 3.1 $G$ must be $(q, p+s)$-rigid. ■


Next  we  address  connectivity. 

Theorem 4.2 A $(q,p)$-connected graph is $(q-s, p+s)$-connected for all $0 \le s < q$.

Proof: It suffices to prove the result for $s = 1$; a simple induction then completes the proof. We first assert that if a graph
$G = (V, E)$ is connected then for any $E_1$, a set of edges incident on the elements of $V$, the graph $(V, E \cup E_1)$ is connected. Suppose now
$G$ is $(q,p)$-connected. To prove the result we need to show that the induced subgraph $G' = (V', E')$ obtained by removing any
$q - 2$ vertices and $p - 1$ edges is $(1,2)$-connected. By hypothesis $G'$ is $(2,1)$-connected. To establish a contradiction suppose
the removal of an edge $e \in E'$ makes $G'' = (V', E' \setminus \{e\})$ not connected. This must be false as by Bollobás [3], Theorem 2.5, in
the $(2,1)$-connected graph $G'$ any pair of vertices must have at least two disjoint paths connecting them. The result follows. ■


We conclude this section with the following theorem that directly follows from Theorems 4.1 and 4.2, together with the
characterization of redundant global rigidity in terms of redundant rigidity and connectivity of Theorem 3.4.

Theorem 4.3 A graph $G = (V, E)$, with $|V| \ge p + q + 3$, that is $(q+s, p)$-globally rigid for some $s > 0$, is $(q, p+s)$-globally
rigid.
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5  Redundant  Edge  Rigidity:  The  Laman  Approach 

This section focuses on providing a Laman type necessary and sufficient condition for $(1,p)$-rigidity. First we need a definition of minimal $(1,p)$-rigidity. Unsurprisingly, when $p = 1$, it is consistent with the standard notion of (nonredundant) minimal rigidity.

Definition 5.1 A graph $G = (V,E)$ is minimally $(1,p)$-rigid if it is $(1,p)$-rigid and loses rigidity after the removal of any set of $p$ edges.

We first provide a necessary and sufficient condition for minimal $(1,p)$-rigidity.

Theorem 5.1 A graph $G = (V,E)$ is minimally $(1,p)$-rigid if and only if both the conditions below hold:

(a) $|E| = 2|V| - 4 + p$.

(b) For any $X \subset V$, $1 < |X| < |V|$, $i_G(X) \le 2|X| - 3$.

Proof: For necessity assume that $G$ is minimally $(1,p)$-rigid. To prove necessity of (a), observe that if $|E| < 2|V| - 4 + p$ then removal of $p-1$ edges will result in a subgraph with fewer than $2|V| - 3$ edges, rendering the subgraph nonrigid. Now suppose $|E| > 2|V| - 4 + p$, and the graph is $(1,p)$-rigid. Then the removal of at least one set of $p$ edges will retain rigidity. Hence for minimal $(1,p)$-rigidity (a) is necessary. Suppose now the graph is minimally $(1,p)$-rigid and (a) holds but (b) is violated. This means there is an $X \subset V$, $1 < |X| < |V|$, such that $i_G(X) > 2|X| - 3$. Now $(1,p)$-rigidity must imply that every vertex has at least $p+1$ edges incident on it. Now remove $p-1$ edges from any such vertex not in $X$. Call the resulting edge set $E'$. Clearly by (a) $|E'| = 2|V| - 3$. The graph $G' = (V, E')$ by assumption is minimally rigid. Observe that the edges removed from $E$ to obtain $E'$ cannot be in the subgraph induced by $X$ in $G$. Thus $i_{G'}(X) = i_G(X) > 2|X| - 3$, establishing a contradiction by violating Laman's theorem.

To prove sufficiency observe that as (a) is necessary, if $G$ is $(1,p)$-rigid then it is minimally so. Thus to prove sufficiency we need simply show that (a) and (b) imply $(1,p)$-rigidity. Indeed remove any $p-1$ edges from $E$. Call the resulting edge set $E'$. Clearly $|E'| = 2|V| - 3$. Further because of (b), for the graph $G' = (V, E')$ and all $X \subset V$, $1 < |X| < |V|$, one has $i_{G'}(X) \le i_G(X) \le 2|X| - 3$. Hence $G'$ is minimally rigid. ■


This  result  is  in  fact  quite  powerful,  in  that  beyond  a  precise  edge  count,  all  it  requires  is  that  Laman’s  condition  be  satisfied  by 
the  graph  induced  by  any  proper  subset  of  V. 
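
The two conditions of Theorem 5.1 are purely combinatorial, so on small graphs they can be tested directly. The following is a minimal sketch with names of our own choosing (`induced_edge_count`, `satisfies_theorem_5_1`); it enumerates every proper vertex subset and is therefore exponential in $|V|$. It is not an efficient algorithm of the pebble-game kind.

```python
from itertools import combinations

def induced_edge_count(edges, subset):
    """i_G(X): number of edges with both endpoints in X."""
    subset = set(subset)
    return sum(1 for u, v in edges if u in subset and v in subset)

def satisfies_theorem_5_1(vertices, edges, p):
    """Check (a) |E| = 2|V| - 4 + p and (b) i_G(X) <= 2|X| - 3 for every
    X with 1 < |X| < |V|. Brute force; intended for small graphs only."""
    vertices, edges = list(vertices), list(edges)
    n = len(vertices)
    if len(edges) != 2 * n - 4 + p:
        return False
    for size in range(2, n):
        for subset in combinations(vertices, size):
            if induced_edge_count(edges, subset) > 2 * size - 3:
                return False
    return True

K4_EDGES = list(combinations(range(4), 2))
K5_EDGES = list(combinations(range(5), 2))
```

For example, $K_4$ has $6 = 2 \cdot 4 - 4 + 2$ edges and every triple of vertices induces $3 = 2 \cdot 3 - 3$ edges, so it passes the test with $p = 2$. $K_5$ has the edge count required for $p = 4$ but fails (b), since any four vertices induce six edges.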

Rigidity, as opposed to minimal rigidity, is characterized through Laman's theorem using the property that for a rigid graph $G = (V,E)$ there must exist an $E' \subset E$ such that $G' = (V, E')$ is minimally rigid. The example depicted in figure 3(a) shows, however, that this is not true in general for $(1,p)$-rigidity.

To analyze the graph $G = (V,E)$ depicted in figure 3(a), observe that every vertex in the first and last rows of this nine vertex graph has an edge to every vertex in the second row. Thus in all there are 18 edges in this graph. Obviously, as

$$18 > 2 \times 9 - 4 + 2,$$

this graph cannot be minimally $(1,2)$-rigid. It is also clear that there is no $E' \subset E$ such that $(V, E')$ is minimally $(1,2)$-rigid. This is so because, were such an $E'$ to exist, $E \setminus E'$ would have precisely two elements in it. It is clear that the removal of any edge from $G$ leaves at least one vertex with degree equal to two. Then no matter what further edge is removed, the resulting subgraph cannot be $(1,2)$-rigid.

We now assert that $G$ is in fact $(1,2)$-rigid. To see this first observe that the graph induced by the vertices $\{1,\dots,6\}$ and that induced by $\{4,\dots,9\}$ are each minimally rigid. Indeed the graph in figure 3(b) is minimally rigid. The graph induced by $\{1,\dots,6\}$ can be built from this by two edge-splitting operations: First add vertex 1 with its three edges. Remove edge $e_{45}$. Second add vertex 3 with its three edges. Remove edge $e_{65}$. The minimal rigidity of the graph induced by $\{4,\dots,9\}$ similarly follows.

Also note that there is a complete symmetry between the edges, in that if we show that the graph obtained by removing the edge $e_{14}$ is rigid, this implies that a graph obtained by removing any single edge is rigid. Now remove the edge $e_{14}$. Since the graph induced by $\{4,\dots,9\}$ is minimally rigid, the graph $(V, E \setminus \{e_{14}, e_{25}, e_{36}\})$ is minimally rigid, being obtainable by three edge-splitting Henneberg operations [7]. Thus the graph $(V, E \setminus \{e_{14}\})$ is rigid. Thus indeed the graph in figure 3 is $(1,2)$-rigid.

Fig. 3. (a) A $(1,2)$-rigid graph that does not contain a minimally $(1,2)$-rigid subgraph; (b) A graph used in proving the claim for the graph of (a).
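
The claims about the nine-vertex example can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes the vertices are labelled row by row (first row 1-3, second row 4-6, third row 7-9), which matches the induced subgraphs named in the text, and it tests generic rigidity by computing the rank of the rigidity matrix at a pseudo-random placement, with arithmetic modulo a large prime. A rank of $2|V| - 3$ certifies rigidity; a smaller rank at a random placement indicates nonrigidity only with high probability, unless a counting argument rules rigidity out.

```python
import random
from itertools import combinations

PRIME = (1 << 61) - 1  # all arithmetic is done modulo this prime

def rank_mod_prime(rows):
    """Rank of an integer matrix over GF(PRIME) by Gaussian elimination."""
    rows = [[x % PRIME for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, PRIME)
        rows[rank] = [x * inv % PRIME for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % PRIME for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def is_rigid(vertices, edges, seed=0):
    """True if the rigidity matrix at a random placement has rank 2|V| - 3."""
    vertices = list(vertices)
    rng = random.Random(seed)
    pos = {v: (rng.randrange(PRIME), rng.randrange(PRIME)) for v in vertices}
    col = {v: 2 * k for k, v in enumerate(vertices)}
    rows = []
    for u, v in edges:
        row = [0] * (2 * len(vertices))
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        row[col[u]], row[col[u] + 1] = dx, dy
        row[col[v]], row[col[v] + 1] = -dx, -dy
        rows.append(row)
    return rank_mod_prime(rows) == 2 * len(vertices) - 3

def is_1p_rigid(vertices, edges, p):
    """(1,p)-rigidity: rigid after deleting any p - 1 edges."""
    edges = list(edges)
    for cut in combinations(edges, p - 1):
        if not is_rigid(vertices, [e for e in edges if e not in cut]):
            return False
    return True

V = range(1, 10)
# Rows 1 and 3 are each joined completely to the middle row {4, 5, 6}.
E = [(a, b) for a in (1, 2, 3, 7, 8, 9) for b in (4, 5, 6)]
MIN_DEGREE = min(sum(1 for e in E if v in e) for v in V)
```

Here `len(E)` is 18 and `MIN_DEGREE` is 3, so deleting any edge at an outer-row vertex leaves a vertex of degree two, in line with the argument that no 16-edge subgraph can be minimally $(1,2)$-rigid.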

Despite the above observation that a $(1,p)$-rigid graph may not contain a minimally $(1,p)$-rigid graph, the following theorem does hold.

Theorem 5.2 A graph $G = (V,E)$ is $(1,p)$-rigid if there is an $E' \subset E$ such that $G' = (V, E')$ is minimally $(1,p)$-rigid.

Proof: Suppose there is an $E' \subset E$ such that $G' = (V, E')$ is minimally $(1,p)$-rigid. Consider any $E_p \subset E$ comprising $p-1$ elements. Observe $|E_p \cap E'| < p$. Thus $(V, E' \setminus E_p)$ is rigid. Consequently, as adding edges to a rigid graph retains rigidity and $E' \setminus E_p \subset E \setminus E_p$, it follows that $(V, E \setminus E_p)$ is rigid, proving the result. ■


Evidently then there are two distinct classes of $(1,p)$-rigid graphs. The first comprises graphs that can be made minimally $(1,p)$-rigid by removing some edges. The other is exemplified by the graph of figure 3(a). It is thus helpful to formalize this distinction through two separate classifications of $(1,p)$-rigidity.

Definition 5.2 A graph $G = (V,E)$ is strongly $(1,p)$-rigid if there exists an $E' \subset E$ such that $G' = (V, E')$ is minimally $(1,p)$-rigid.

By  contrast  we  have  the  class  represented  by  the  example  in  figure  3(a). 

Definition 5.3 A graph $G = (V,E)$ is weakly $(1,p)$-rigid if it is $(1,p)$-rigid but for every $E' \subset E$, $G' = (V, E')$ is not minimally $(1,p)$-rigid.

Evidently, using Theorem 5.2 one arrives at the Laman type characterization of strong $(1,p)$-rigidity provided in Theorem 5.3 below, which is a direct analogue of Theorem 2.1. On the other hand, a Laman type characterization of weak $(1,p)$-rigidity remains open.

Theorem 5.3 A graph $G = (V,E)$ is strongly $(1,p)$-rigid if and only if there is an $E' \subset E$ such that the subgraph $G' = (V, E')$ obeys both the conditions below:

(a) $|E'| = 2|V| - 4 + p$.

(b) For any $X \subset V$, $1 < |X| < |V|$, $i_{G'}(X) \le 2|X| - 3$.


6 Redundant Edge Rigidity: The Lovasz-Yemeni Approach

In this section, we characterize $(1,p)$-rigidity using the Lovasz-Yemeni approach. As in section 5 we will obtain the result by attending first to minimally $(1,p)$-rigid graphs.

Lemma 6.1 Let $G = (V,E)$ be a minimally $(1,p)$-rigid graph. Then $r_p(G) = 2|V| - 4 + p$, where $r_p(G)$ is defined in Definition 2.3.

Proof: As the graph is minimally $(1,p)$-rigid, from Theorem 5.1, $|E| = 2|V| - 4 + p$. Now consider $F_1, \dots, F_l$,

$$l \ge p, \tag{6.2}$$

with each $F_i$ a nonempty subset of $E$. Suppose

$$\bigcup_{i=1}^{l} F_i = E. \tag{6.3}$$

Then from Definition 2.1,

$$P = \{G_i = (V_G(F_i), F_i)\}_{i=1}^{l} \tag{6.4}$$

is an $l$-AD of $G$. Given that $l \ge p$, any $l$-AD is also a $p$-AD. Thus (6.4) is a $p$-AD under (6.2)-(6.4). Henceforth assume (6.2)-(6.4) is in force. We will consider two cases that cover all possibilities.

Case I: For some $i$,

$$V_G(F_i) = V.$$

Observe that for any $j \ne i$ and nonempty $F_j$, $2|V_G(F_j)| - 3 \ge 1$. Thus as (6.2) holds, the corresponding $r(P) \ge (2|V| - 3) + l - 1 \ge 2|V| - 4 + p$.

Case II: There holds:

$$V_G(F_i) \ne V \quad \forall i. \tag{6.5}$$

Call the class of $F = \{F_1, \dots, F_l\}$ that lead to (6.2)-(6.5) $\mathcal{F}$. Observe from (6.5) that for each $i$ and $F \in \mathcal{F}$, $|V_G(F_i)| < |V|$. By identifying $V_G(F_i)$ with $X$ in Theorem 5.1, we see that $|F_i| \le 2|V_G(F_i)| - 3$, i.e. each $F_i$ is a maximally independent subset in $G_i$. Consequently, by Lemma 2.2, for each $i$, $G_i = (V_G(F_i), F_i)$ obeys $r_1(G_i) = |F_i|$. Then there holds:

$$r_p(G) = \min_{F \in \mathcal{F}} \left\{ \sum_{i=1}^{l} r_1\bigl((V_G(F_i), F_i)\bigr) \right\} = \min_{F \in \mathcal{F}} \left\{ \sum_{i=1}^{l} |F_i| \right\} \ge |E| = 2|V| - 4 + p. \tag{6.6}$$

It remains to show that there is a $p$-AD whose index is exactly $2|V| - 4 + p$. Indeed label the elements of $E$ as $e_1, \dots, e_{2|V|-4+p}$, and choose $P'' = \{(V_G(\{e_i\}), \{e_i\})\}_{i=1}^{2|V|-4+p}$. Then as for each $i$, $2|V_G(\{e_i\})| - 3 = 1$, $r(P'') = |E| = 2|V| - 4 + p$. ■


We next derive a result that shows that $r_p(G) \ge 2|V| - 4 + p$ is a sufficient condition for $(1,p)$-rigidity.

Lemma 6.2 Let $G = (V,E)$ be a graph with $r_p(G) \ge 2|V| - 4 + p$. Then $G$ is $(1,p)$-rigid.

Proof: Suppose the graph is not $(1,p)$-rigid. Then there exists an $E'$, obtained by removing from $E$ an edge set $E_p$ comprising $p-1$ edges, such that $G' = (V, E')$ is nonrigid. Thus, from Theorem 2.4, there is an AD of $G'$, call it $P$, such that $r(P) \le 2|V| - 4$. Now consider

$$P' = \Bigl( \bigcup_{e \in E_p} \{(V_G(\{e\}), \{e\})\} \Bigr) \cup P.$$

Observe $|V_G(\{e\})| = 2$ for each $e \in E_p$. As $P'$ is a $p$-AD of $G$, from (2.1), we obtain:

$$r_p(G) \le r(P') = r(P) + \sum_{e \in E_p} \bigl(2|V_G(\{e\})| - 3\bigr) \le r(P) + p - 1 \le 2|V| - 5 + p,$$

establishing a contradiction. ■


Combining these two Lemmas we obtain the following necessary and sufficient condition for minimal $(1,p)$-rigidity.

Theorem 6.1 A graph $G = (V,E)$ is minimally $(1,p)$-rigid iff $r_p(G) \ge |E| = 2|V| - 4 + p$.

Proof: By Theorem 5.1, condition (a), and Lemma 6.1, minimal $(1,p)$-rigidity implies $r_p(G) = |E| = 2|V| - 4 + p$. Conversely, suppose that $r_p(G) \ge |E| = 2|V| - 4 + p$. By Lemma 6.2, $G$ is $(1,p)$-rigid. Now argue as in the proof of Theorem 5.1: as $|E| = 2|V| - 4 + p$, the removal of any set of $p$ edges will give a graph with $2|V| - 4$ edges, i.e. a nonrigid graph. This establishes minimal rigidity. ■
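
For very small graphs the quantity $r_p(G)$ can be evaluated by exhaustive search, which makes Lemma 6.1 and Theorem 6.1 easy to try out. The sketch below rests on two assumptions taken from the way the index is used in this section, since Definitions 2.1 and 2.3 are not restated here: the index of a decomposition is $r(P) = \sum_i (2|V_G(E_i)| - 3)$, and a $p$-AD is a decomposition with at least $p$ nonempty parts. It searches edge partitions only, which by Lemma 7.1 loses nothing. All function names are our own.

```python
def set_partitions(items):
    """Yield every partition of a list into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def index_of(parts):
    """r(P): sum over parts of 2|V_G(E_i)| - 3 (assumed form of the index)."""
    return sum(2 * len({v for e in part for v in e}) - 3 for part in parts)

def r_p(edges, p):
    """Minimum index over edge partitions with at least p parts."""
    return min(index_of(parts) for parts in set_partitions(list(edges))
               if len(parts) >= p)

K4_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
SINGLE_EDGE_INDEX = index_of([[e] for e in K4_EDGES])
```

For $K_4$, which satisfies the conditions of Theorem 5.1 with $p = 2$, the search returns $r_2 = 6 = 2|V| - 4 + 2$, attained for instance by the single-edge decomposition $P''$ used at the end of the proof of Lemma 6.1. The number of partitions grows as the Bell numbers, so this is only a checking aid.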


The question remains whether $r_p(G) \ge 2|V| + p - 4$ is a necessary condition for nonminimal $(1,p)$-rigidity (assuming that $|E| \ge 2|V| + p - 4$). The graph $G = (V,E)$ in figure 4 serves as a counterexample.


Fig. 4. A $(1,2)$-rigid graph with $r(P) < 2|V| - 2$.

Consider the three edge partitions: $E_1 = E_G(\{1,2,4,5\})$, $E_2 = E_G(\{3,7,4,8\})$ and $E_3 = E_G(\{8,9,6,5\})$. $P = \{(V_G(E_i), E_i)\}_{i=1}^{3}$ is a 2-AD for $G$. In this case:

$$r(P) = 3 \times (8 - 3) = 15 < 2 \times 9 - 2 = 16.$$

The graph is clearly rigid after removing any single edge other than from the set $\{e_{45}, e_{48}, e_{58}\}$. Now without sacrificing generality remove the edge $e_{45}$. Observe it can be reconstructed by an edge-splitting operation on $(V \setminus \{2\}, E_G(V \setminus \{2\}))$. Thus this graph is $(1,2)$-rigid.

Nonetheless the obvious Lovasz-Yemeni type necessary condition does hold for strong $(1,p)$-rigidity, as we now see.

Theorem 6.2 A graph $G = (V,E)$ with $r_p(G) \ge 2|V| - 4 + p$ is $(1,p)$-rigid. It is strongly $(1,p)$-rigid only if $r_p(G) \ge 2|V| - 4 + p$.

Proof: The first statement follows from Lemma 6.2.

We will use induction to prove the second claim. Clearly from Theorem 2.4 the graph $G = (V,E)$ is $(1,1)$-rigid only if $r_1(G) \ge 2|V| - 4 + 1$. Now suppose for all $1 \le l \le p-1$ the graph $G = (V,E)$ is strongly $(1,l)$-rigid only if $r_l(G) \ge 2|V| - 4 + l$.

Suppose the graph $G = (V,E)$ is strongly $(1,p)$-rigid but $r_p(G) < 2|V| - 4 + p$. Then from Lemma 6.1, it is not minimally $(1,p)$-rigid. From Theorem 5.2 there is an $E' \subset E$ such that $G' = (V, E')$ is minimally $(1,p)$-rigid. Call $F = E \setminus E'$. Now consider any $p$-AD, $P = \{(V_G(E_1), E_1), \dots, (V_G(E_k), E_k)\}$ of $G$. By definition $k \ge p$. Further:

$$\bigcup_{i=1}^{k} E_i = E.$$

Thus:

$$\bigcup_{i=1}^{k} (E_i \setminus F) = E'.$$

Without sacrificing generality label the nonempty sets among $E_i \setminus F$ as $E_1 \setminus F, \dots, E_m \setminus F$, with $1 \le m \le k$.

Then $P' = \{(V_G(E_1 \setminus F), E_1 \setminus F), \dots, (V_G(E_m \setminus F), E_m \setminus F)\}$ is an $m$-AD of $G'$. Observe also that:

$$|V_G(E_i \setminus F)| \le |V_G(E_i)| \quad \forall\, 1 \le i \le m \tag{6.7}$$

and

$$|V_G(E_i)| \ge 2 \quad \forall\, m+1 \le i \le k. \tag{6.8}$$

Thus one obtains

$$r(P) = \sum_{i=1}^{k} \bigl[2|V_G(E_i)| - 3\bigr] \ge \sum_{i=1}^{m} \bigl[2|V_G(E_i \setminus F)| - 3\bigr] + \sum_{i=m+1}^{k} \bigl[2|V_G(E_i)| - 3\bigr],$$

i.e.

$$r(P) \ge r(P') + k - m. \tag{6.9}$$

Now suppose $m < p$ (the contrary case is considered below). Then the graph $G' = (V, E')$ is $(1,m)$-rigid, so by the inductive hypothesis:

$$r(P') \ge 2|V| - 4 + m.$$

Thus from (6.9) there holds

$$r(P) \ge r(P') + k - m \ge 2|V| - 4 + m + k - m = 2|V| - 4 + k \ge 2|V| - 4 + p.$$

On the other hand, if $m \ge p$ then $r(P)$ is still no less than $r(P')$ as (6.7) and (6.8) continue to hold. Consequently from Lemma 6.1,

$$r(P) \ge r(P') \ge 2|V| - 4 + p. \tag{6.10}$$

The result follows. ■


Thus while both Lovasz-Yemeni and Laman type necessary and sufficient conditions exist for minimal $(1,p)$-rigidity, and are necessary for strong $(1,p)$-rigidity, these only provide sufficient conditions for weak $(1,p)$-rigidity.

Recall however that the direct subject of investigation in this paper is redundant global rigidity, which beyond redundant rigidity also requires redundant connectivity. It is instructive to note that the example of Fig. 3 is $(3,1)$-connected, though that of figure 4 is not. It is also possible to show that the graph in Fig. 3 does satisfy the corresponding Lovasz-Yemeni sufficient condition for $(1,2)$-rigidity. Indeed the next section demonstrates that a $(1,p)$-rigid graph that is not necessarily strongly $(1,p)$-rigid but has an additional redundant connectivity property must have $r_p(G) \ge 2|V| - 4 + p$.

7 Lovasz-Yemeni type condition for redundant rigidity with redundant connectivity

This section considers levels of connectivity needed for a $(1,p)$-rigid graph to satisfy $r_p(G) \ge 2|V| - 4 + p$. At the end of this section we will tie the results here to redundant global rigidity.

We first present a lemma showing that for any $p \ge 1$, at least one $p$-minimizing AD of $G = (V,E)$ must have underlying edge sets that partition, rather than merely cover, $E$. This lemma thus extends a simplifying conclusion of Theorem 2.4 to the case of $p > 1$.

Lemma 7.1 Suppose $P = \{(V_G(E_i), E_i)\}_{i=1}^{k}$ is a $k$-AD of $G = (V,E)$. Then there are $\bar{E}_1, \dots, \bar{E}_k$ that partition $E$, such that for $\bar{P} = \{(V_G(\bar{E}_i), \bar{E}_i)\}_{i=1}^{k}$ there holds:

$$r(\bar{P}) \le r(P).$$

Proof: Since $P$ is a $k$-AD, the $E_i$ are all nonempty and form a cover of $E$. Thus there exist mutually disjoint, nonempty

$$\bar{E}_i \subset E_i$$

such that the $\bar{E}_i$ partition $E$. Then the result follows by noting that $V_G(\bar{E}_i) \subset V_G(E_i)$. ■


Because of this we will assume henceforth that the edge covers underlying the ADs we consider are disjoint. Since our goal here is to tie Lovasz-Yemeni type conditions to a coupling of redundant connectivity and redundant rigidity, the next lemma records a key consequence of redundant connectivity.

Lemma 7.2 Suppose $G = (V,E)$ is $(q,1)$-connected. Consider $P = \{(V_G(E_i), E_i)\}_{i=1}^{k}$, a $k$-AD of $G$, with the $E_i$ mutually disjoint and $k \ge 2$. Suppose for some $i$ that $|V_G(E_i)| \ge q$ and $|V_G(E \setminus E_i)| \ge q$. Then $|V_G(E_i) \cap V_G(E \setminus E_i)| \ge q$.

Proof: Consider Fig. 5, which depicts the three vertex sets $V_G(E_i) \setminus V_G(E \setminus E_i)$, $V_G(E_i) \cap V_G(E \setminus E_i)$ and $V_G(E \setminus E_i) \setminus V_G(E_i)$. Observe that all paths from vertices in the leftmost set to those in the rightmost are through the middle set. To establish a contradiction suppose the middle set has fewer than $q$ vertices. Then the removal of these vertices makes the middle set empty, making the resulting graph not connected. This contradicts the fact that the original graph is $(q,1)$-connected. ■


Fig.  5.  Illustration  of  the  proof  of  Lemma  7.2. 

Next  we  record  a  simple  lemma  that  will  be  exploited  in  the  sequel. 

Lemma 7.3 Consider a set $X$ and subsets $X_1, \dots, X_k$, $k > 1$, that cover $X$. Suppose for all $i$,

$$X_i \subset \bigcup_{j \ne i} X_j. \tag{7.11}$$

Then there holds:

$$2|X| \le \sum_{i=1}^{k} |X_i|. \tag{7.12}$$

Proof: Since $X_1, \dots, X_k$ cover $X$, (7.11) ensures that every element of $X$ appears in at least two sets among the $X_i$. ■
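
The counting step behind this one-line proof can be written out as follows. Each $x \in X$ lies in some $X_i$ because the sets cover $X$, and by (7.11) it also lies in some $X_j$ with $j \ne i$. Summing the set sizes therefore counts every element at least twice:

```latex
\[
\sum_{i=1}^{k} |X_i|
  \;=\; \sum_{x \in X} \bigl|\{\, i : x \in X_i \,\}\bigr|
  \;\ge\; \sum_{x \in X} 2
  \;=\; 2|X| ,
\]
```

which is (7.12).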


Much  of  the  analysis  in  the  remainder  of  this  section  hinges  on  the  following  pivotal  Lemma. 

Lemma 7.4 Consider a set $V$ and subsets $V_1, \dots, V_k$, $k > 1$, that cover $V$. Suppose for some $q$ and all $i$, $\bigl|V_i \cap \{\bigcup_{j \ne i} V_j\}\bigr| \ge q$.

Then there holds:

$$2\sum_{i=1}^{k} |V_i| - 2|V| \ge qk. \tag{7.13}$$


Proof: Define

$$X_i = V_i \cap \Bigl\{\bigcup_{j \ne i} V_j\Bigr\} \tag{7.14}$$

and

$$S_i = V_i \setminus X_i. \tag{7.15}$$

Also denote:

$$S = \bigcup_{i=1}^{k} S_i \tag{7.16}$$

and

$$X = \bigcup_{i=1}^{k} X_i. \tag{7.17}$$

From (7.14) and the lemma statement it is evident that

$$|X_i| \ge q. \tag{7.18}$$

From (7.14) and (7.15) it is also clear that for all $i, j$,

$$S_i \cap X_j = \emptyset, \tag{7.19}$$

for all $i \ne j$,

$$S_i \cap S_j = \emptyset, \tag{7.20}$$

and for all $i$,

$$|V_i| = |S_i| + |X_i|. \tag{7.21}$$

From (7.19)-(7.21) one can verify after some manipulation that for all $i$

$$V_i \cap \Bigl\{\bigcup_{j \ne i} V_j\Bigr\} = X_i \cap \Bigl\{\bigcup_{j \ne i} X_j\Bigr\}.$$

Consequently the $X_i$ obey (7.11) and Lemma 7.3 applies.

Further, the $S_i$ partition $S$, while the $X_i$ cover $X$, and $V = S \cup X$. Then because of (7.19) and (7.20) there holds:

$$|V| = |S| + |X| = \sum_{i=1}^{k} |S_i| + |X|.$$

Thus, because of (7.12) and (7.19), and Lemma 7.3, there holds:

$$\sum_{i=1}^{k} |V_i| - |V| = \sum_{i=1}^{k} |S_i| + \sum_{i=1}^{k} |X_i| - \sum_{i=1}^{k} |S_i| - |X| = \sum_{i=1}^{k} |X_i| - |X| \ge \frac{1}{2} \sum_{i=1}^{k} |X_i|.$$

Thus, because of (7.18) we obtain:

$$2\sum_{i=1}^{k} |V_i| - 2|V| \ge \sum_{i=1}^{k} |X_i| \ge qk.$$

■


We then have the first result connecting Lovasz-Yemeni type conditions to $(1,p)$-global rigidity for $p \le 4$.


Theorem 7.1 Suppose, for integer $p$ with $2 \le p \le 4$, $G = (V,E)$ is $(3, p-1)$-connected and $(1,p)$-rigid. Then $r_p(G) \ge 2|V| - 4 + p$.

Proof: We will use induction. To obtain the basis step consider first $p = 2$. Suppose $G = (V,E)$ is $(3,1)$-connected and $(1,2)$-rigid. (In fact, this means that $G$ is globally rigid; we will explore this issue later in more detail.) Suppose for $k \ge 2$, $P = \{(V_G(E_i), E_i)\}_{i=1}^{k}$ is a $k$-AD of $G$, with the $E_i$ partitioning $E$. Then by Lemma 7.1 $r_2(G)$ is the index of one such $k$-AD.

Suppose first that for some $i$, $|E_i| = 1$. Without loss of generality assume $i = 1$. Since $G$ is $(1,2)$-rigid, $G' = (V, E \setminus E_1)$ is rigid. Since the $E_i$ are disjoint, for all $i > 1$ the sets $E_i \setminus E_1 = E_i$ are all nonempty and partition $E \setminus E_1$. Thus $P' = \{(V_G(E_i \setminus E_1), E_i \setminus E_1)\}_{i=2}^{k}$ is a $(k-1)$-AD of $G'$. Since $G'$ is rigid, by Theorem 2.4

$$r(P') \ge 2|V| - 3.$$

Hence:

$$r(P) = r(P') + 2|V_G(E_1)| - 3 \ge 2|V| - 2,$$

i.e. all such $P$ have indices that exceed or equal the hypothesized value of $r_2(G)$.

There remains the case where all $|E_i| > 1$. In this case for all $i$, $|V_G(E_i)| \ge 3$. Further, as the $E_i$ partition $E$, for all $i$ there holds

$$V_G(E \setminus E_i) = V_G\Bigl(\bigcup_{j \ne i} E_j\Bigr) = \bigcup_{j \ne i} V_G(E_j). \tag{7.22}$$

As the graph is $(3,1)$-connected, by Lemma 7.2, for all $i$,

$$|V_G(E \setminus E_i) \cap V_G(E_i)| \ge 3. \tag{7.23}$$

Thus, identifying $V_i = V_G(E_i)$, the conditions of Lemma 7.4 hold with $q = 3$. Consequently,

$$r(P) = \sum_{i=1}^{k} (2|V_i| - 3) = 2\sum_{i=1}^{k} |V_i| - 3k \ge 2|V| + 3k - 3k \ge 2|V| - 2.$$

Now for the induction step, suppose the result holds for some $p$, $2 \le p < 4$. Suppose $G = (V,E)$ is $(3,p)$-connected and $(1,p+1)$-rigid. Consider for $k \ge p+1$, $P = \{(V_G(E_i), E_i)\}_{i=1}^{k}$, a $k$-AD of $G$, with the $E_i$ partitioning $E$. Then by Lemma 7.1 $r_{p+1}(G)$ is the index of one such $k$-AD.

Suppose first that for some $i$, $|E_i| = 1$. Without loss of generality assume $i = 1$. Since $G$ is $(3,p)$-connected and $(1,p+1)$-rigid, $G' = (V, E \setminus E_1)$ is $(3,p-1)$-connected and $(1,p)$-rigid. Since the $E_i$ are disjoint, for all $i > 1$ the sets $E_i \setminus E_1 = E_i$ are all nonempty and partition $E \setminus E_1$. Thus $P' = \{(V_G(E_i \setminus E_1), E_i \setminus E_1)\}_{i=2}^{k}$ is a $(k-1)$-AD of $G'$. By the induction hypothesis:

$$r(P') \ge 2|V| - 4 + p.$$

Hence:

$$r(P) = r(P') + 2|V_G(E_1)| - 3 \ge 2|V| - 4 + p + 1,$$

i.e. all such $P$ have indices that exceed or equal the hypothesized value of $r_{p+1}(G)$. There remains the case where all $|E_i| > 1$. In this case for all $i$, $|V_G(E_i)| \ge 3$. Further, as the $E_i$ partition $E$, for all $i$, (7.22) holds. Further, a $(3,p)$-connected graph is also $(3,1)$-connected. Then, by Lemma 7.2, for all $i$, (7.23) holds.

Thus, identifying $V_i = V_G(E_i)$, the conditions of Lemma 7.4 hold with $q = 3$. Consequently, as $p < 4$,

$$r(P) = \sum_{i=1}^{k} (2|V_i| - 3) = 2\sum_{i=1}^{k} |V_i| - 3k \ge 2|V| + 3k - 3k \ge 2|V| + 1 + p - 4.$$

Thus the result follows. ■


There  remains  the  question  whether  the  result  of  Theorem  7.1  will  hold  for  p  >  4.  We  now  present  a  counterexample  for  p  =  5. 

Example 7.1 The graph in question, $G = (V,E)$, has $|V| = 33$, and has a vertex cover $V_1, \dots, V_6$ such that each $|V_i| = 7$, each subgraph $(V_i, E_G(V_i))$ is a $K_7$ graph, and the sets $E_i = E_G(V_i)$ constitute a partition of $E$. Further, defining

$$X_i = V_i \cap \Bigl(\bigcup_{j \ne i} V_j\Bigr), \tag{7.24}$$

one has $X_1 = \{1,2,3\}$, $X_2 = \{1,4,5\}$, $X_3 = \{2,6,7\}$, $X_4 = \{3,8,9\}$, $X_5 = \{4,6,8\}$ and $X_6 = \{5,7,9\}$. In other words $V_i$ comprises $X_i$ together with four other vertices that are not in any other $V_j$. Thus there are 24 vertices each belonging to just one $V_j$, and 9 which are in exactly two.

To see that this choice of $X_i$ is consistent with the fact that the $E_i$ partition $E$, observe that for all $i \ne j$,

$$|X_i \cap X_j| \le 1. \tag{7.25}$$

Further, vertices in each $V_i$ connect to the others through the elements of $X_i$. Thus, indeed this definition is consistent with the requirement that the $E_i$ are disjoint.

In effect then $G$ comprises six $K_7$ subgraphs, connected through the vertices in the $X_i$. Each $V_i$ has three vertices through which it connects to other $V_j$, there being one vertex in common with each of three different $V_j$.


It is shown in the appendix that $G$ is both $(3,4)$-connected and $(1,5)$-rigid. Since each subgraph $(V_i, E_G(V_i))$ is $K_7$, it also follows that $V_i = V_G(E_i)$. Thus as the $E_i$ partition $E$, we have that

$$P = \{(V_i, E_G(V_i))\}_{i=1}^{6}$$

is a 6-AD of $G$. Hence,

$$r_5(G) \le r(P) = 6(14 - 3) = 66 = 2|V| < 2|V| - 4 + 5,$$

and the conclusion of Theorem 7.1 is violated for $p = 5$.
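
The bookkeeping in Example 7.1 is easy to reproduce. The sketch below builds the six vertex sets from the listed $X_i$, giving each $V_i$ four private vertices whose labels (10 to 33) are our own choice, and then recomputes the quantities quoted above. It checks only the combinatorial counts; it does not verify the connectivity and rigidity claims deferred to the appendix.

```python
from itertools import combinations

X = [{1, 2, 3}, {1, 4, 5}, {2, 6, 7}, {3, 8, 9}, {4, 6, 8}, {5, 7, 9}]

# Each V_i is X_i plus four private vertices (labels 10..33 are arbitrary).
V_SETS = [x | set(range(10 + 4 * i, 14 + 4 * i)) for i, x in enumerate(X)]
ALL_VERTICES = set().union(*V_SETS)

# Every block is complete (K_7), so E_i consists of all pairs inside V_i.
E_SETS = [set(combinations(sorted(v), 2)) for v in V_SETS]

def index_of(vertex_sets):
    """r(P) = sum of 2|V_i| - 3 over the parts of the decomposition."""
    return sum(2 * len(v) - 3 for v in vertex_sets)

NUM_VERTICES = len(ALL_VERTICES)
MAX_OVERLAP = max(len(a & b) for a, b in combinations(X, 2))
EDGES_DISJOINT = all(not (a & b) for a, b in combinations(E_SETS, 2))
INDEX = index_of(V_SETS)
# Slack in (7.13) with q = 3: 2*sum|V_i| - 2|V| - q*k.
LEMMA_7_4_SLACK = 2 * sum(len(v) for v in V_SETS) - 2 * NUM_VERTICES - 3 * len(V_SETS)
```

The computed index is 66, one short of $2|V| - 4 + 5 = 67$, and the slack in (7.13) is zero, so this configuration meets the bound of Lemma 7.4 with $q = 3$ exactly.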


Nonetheless we now show that for $p > 4$, a stronger redundant connectivity condition suffices for $r_p(G) \ge 2|V| - 4 + p$ to hold given $(1,p)$-rigidity.

Theorem 7.2 Suppose, for integer $p \ge 3$, $G = (V,E)$ is $(4, p-2)$-connected and $(1,p)$-rigid. Then $r_p(G) \ge 2|V| - 4 + p$.

Proof: From Theorem 4.2, a $(4, p-2)$-connected graph is also $(3, p-1)$-connected. Thus from Theorem 7.1, the result holds for $3 \le p \le 4$.

To use induction suppose the result holds for all $3 \le p \le n$, $n \ge 4$. Suppose $G = (V,E)$ is $(4, n-1)$-connected and $(1, n+1)$-rigid. Consider for $k \ge n+1$, $P = \{(V_G(E_i), E_i)\}_{i=1}^{k}$, a $k$-AD of $G$, with the $E_i$ partitioning $E$. Then by Lemma 7.1 $r_{n+1}(G)$ is the index of one such $k$-AD.

Suppose first that for some $i$, $|V_G(E_i)| = 3$. Then $|E_i| \le 3$. Without loss of generality assume $i = 1$.

Consider $G' = (V, E \setminus E_1)$. Since the $E_i$ are disjoint, for all $i > 1$, $E_i \setminus E_1 = E_i$. Thus $P' = \{(V_G(E_i \setminus E_1), E_i \setminus E_1)\}_{i=2}^{k}$ is a $(k-1)$-AD of $G'$.

First suppose $n = 4$. Since by hypothesis $G$ is $(1,5)$-rigid, $G'$ is $(1,2)$-rigid. Further, as $G$ is $(4,3)$-connected, by Theorem 4.2 it is $(3,4)$-connected. Consequently, $G'$ is $(3,1)$-connected.

Thus by Theorem 7.1, applying the case of $p = 2 = n - 2$,

$$r(P') \ge 2|V| - 4 + n - 2. \tag{7.26}$$

Hence as $|V_G(E_1)| = 3$,

$$r(P) = r(P') + 2|V_G(E_1)| - 3 \ge 2|V| - 6 + n + 3 = 2|V| - 4 + (n+1), \tag{7.27}$$

i.e. all such $P$ have indices that exceed or equal the hypothesized value of $r_{n+1}(G)$.

Now consider $n > 4$, still with $|V_G(E_1)| = 3$. In this case, as $G$ is $(4, n-1)$-connected and $(1, n+1)$-rigid, $G'$ is $(4, n-4)$-connected and $(1, n-2)$-rigid. As $n - 4 \ge 1$, by the induction hypothesis (7.26) still holds. Hence as $|V_G(E_1)| = 3$, (7.27) also holds and the requirement of the theorem is met.

Thus when there is a $|V_G(E_i)| = 3$, the requirements of the theorem are met. An identical argument to that given in the proof of Theorem 7.1 also dispenses with the case when for some $i$, $|V_G(E_i)| = 2$, as this is tantamount to $|E_i| = 1$. There remains the case where for all $i$, $|V_G(E_i)| \ge 4$. As in Theorem 7.1, (7.22) holds. Further, as $n \ge 4$, $G$ is $(4,1)$-connected. Then by Lemma 7.2, for all $i$,

$$|V_G(E \setminus E_i) \cap V_G(E_i)| \ge 4. \tag{7.28}$$

Thus, identifying $V_i = V_G(E_i)$, the conditions of Lemma 7.4 hold with $q = 4$. Consequently, as $k \ge n+1$,

$$r(P) = \sum_{i=1}^{k} (2|V_i| - 3) = 2\sum_{i=1}^{k} |V_i| - 3k \ge 2|V| + 4k - 3k = 2|V| + k \ge 2|V| + n - 3.$$

The result follows. ■


As global rigidity is equivalent to the combined properties of $(3,1)$-connectivity and $(1,2)$-rigidity, in view of the two theorems in this section we have the following key result.

Theorem 7.3 A graph $G = (V,E)$ is $(1,p)$-globally rigid if it is $(3,p)$-connected and $r_{p+1}(G) \ge 2|V| - 3 + p$. For $1 \le p \le 3$ it is $(1,p)$-globally rigid only if $r_{p+1}(G) \ge 2|V| - 3 + p$. For $p \ge 4$ a $(4, p-2)$-connected $G = (V,E)$ is $(1,p)$-globally rigid only if $r_{p+1}(G) \ge 2|V| - 3 + p$.

Proof: Observe first that $(1,p)$-global rigidity is equivalent to global rigidity being preserved after the deletion of $p-1$ edges, or having $(3,1)$-connectivity and $(1,2)$-rigidity after the deletion of $p-1$ edges, or having initially $(3,p)$-connectivity and $(1,p+1)$-rigidity. By Theorem 6.2, $r_{p+1}(G) \ge 2|V| - 4 + (p+1)$ ensures $(1,p+1)$-rigidity and so the first claim of the theorem holds. The second and third claims follow immediately from Theorems 7.1 and 7.2. ■


Given the relation between global rigidity, anchor count and localizability, there is an obvious extension of this to a corresponding characterization of redundantly localizable two-dimensional sensor networks.

We  comment  that  in  a  sensor  network  in  which  loss  of  nodes  is  contemplated,  it  may  be  that  loss  of  anchor  nodes  can  be  ruled 
out  on  reliability  or  other  grounds.  What  is  needed  for  redundant  localizability  in  addition  to  redundant  global  rigidity  is  that 
after  loss  of  up  to  q-1  nodes,  at  least  three  noncollinear  anchor  nodes  remain.  The  issue  of  which  nodes  fail  does  not  enter  the 
picture  when  the  redundancy  is  all  with  respect  to  edge  loss  of  course. 


8  Conclusions  and  Future  Work 

This work presents fundamental results on edge-redundant global rigidity, which is essential to the problem of guaranteeing network localizability in the event of link losses. The particular notion that is studied in detail is minimal $(1,p)$-rigidity, for which two different types of characterizations were obtained: Laman type and Lovasz-Yemeni type. For the latter, a further connection was established between redundant edge rigidity and redundant connectivity.

As discussed in the beginning of this paper, it is of great importance to also study redundant vertex rigidity, i.e. $(q,1)$-rigidity. Some discussion is provided in the text but its characterization remains largely open. Also, one will need computationally efficient algorithms to check for satisfaction of these conditions for redundant localizability of a network. Moreover, operations are required for one to construct, augment, merge and split such networks, while ensuring that the level of redundant localizability remains unchanged. Results along the lines of [1] providing simple sufficient conditions for redundant global rigidity, largely in terms of local properties of graphs, would also be welcome. At some point too, random geometric graphs should be investigated.

On the Performance Limit of Sensor Localization

Abstract: In this paper, we analyze the performance limit of sensor localization from a novel perspective. We consider distance-based single-hop sensor localization with noisy distance measurements by Received Signal Strength (RSS). Differently from the existing studies, the anchors are assumed to be randomly deployed, with the result that the trace of the associated Cramer-Rao Lower Bound (CRLB) matrix becomes a random variable. We adopt this random variable as a scalar metric for the performance limit and then focus on its statistical attributes. By the Central Limit Theorems for U-statistics, we show that as the number of anchors goes to infinity, this scalar metric is asymptotically normal. In addition, we provide the quantitative relationship among the mean, the standard deviation, the number of anchors, parameters of communication channels and the distribution of the anchors. Extensive simulations are carried out to confirm the theoretical results. On the one hand, our study reveals some fundamental features of sensor localization; on the other hand, the conclusions we draw can in turn guide us in the design of wireless sensor networks.

I.  Introduction 

Location  information  plays  a  vital  role  in  the  applications 
of  sensor  networks,  for  it  is  useful  to  report  the  geographic 
origin  of  events,  to  assist  in  target  tracking,  to  achieve 
geographic  aware  routing,  to  manage  sensor  networks,  to 
evaluate their coverage, and so on. A sensor network generally
consists of two kinds of nodes: anchors and sensors.
Anchor positions are known a priori (e.g., through GPS or
manual configuration), while sensor positions are unknown
and need to be determined through certain localization
procedures. Up to now, considerable effort has been
invested in developing sensor localization algorithms.

Apart from the design of sensor localization algorithms, the
analysis of localization performance has also attracted much
attention. Algorithm-specific performance studies are carried out
to evaluate and compare different sensor localization algorithms.
More importantly, the performance limit of sensor localization,
namely the lower bound on the location estimation errors produced
by all localization algorithms, provides a theoretically optimal
performance no matter what sensor localization algorithm is applied,
and thus reflects the fundamental impact of various factors on sensor
localization in an algorithm-independent manner. Because the
Cramér-Rao Lower Bound (CRLB) is by nature such a bound, it has been
widely used to characterize the performance limit of sensor
localization [1].

Most of the existing CRLB analysis is based on given
sensor-anchor geometries. In this paper, we analyze the
performance limit of single-hop sensor localization from
a novel perspective. As is common in the literature,
we adopt the trace of the associated CRLB matrix as a
scalar metric for the performance limit of sensor localization
[1]. However, unlike existing CRLB studies, which
require exact sensor-anchor geometries to compute the
deterministic CRLB, we assume that a fixed number of sensors
and anchors are randomly deployed in a two-dimensional
plane with distance measurements from Received Signal
Strength (RSS). Consequently, the trace of the associated
CRLB matrix becomes a random variable with respect to
the sensor-anchor geometries, and we focus on the statistical
attributes of the trace of the CRLB.

The motivations of our study are as follows. In a mobile
environment, such as ad-hoc networks, target tracking,
Simultaneous Localization and Mapping (SLAM) [2], mobile
anchors assisting in sensor localization [3] and so on, it is
of limited value to concentrate on the localization performance at
one particular time instant, whereas it is attractive to grasp the
average localization performance over a period of time and over a
wide region. Our statistical modeling method is intended to address
this need. Furthermore, the advantages of our study
include: (i) it provides some knowledge about how the scalar
metric, equivalently the minimal mean square estimation
error (MSE), is distributed over all possible sensor-anchor
geometries; (ii) the mean of the scalar metric reveals how
the average minimal MSE with respect to all possible
sensor-anchor geometries evolves with the number of anchors, the
parameters of the communication channels and the measurement
noise; (iii) the ratio of the standard deviation to the mean
indicates the sensitivity of the minimal MSE to sensor-anchor
geometries; (iv) it not only provides insights into single-hop
sensor localization, including source localization and
target tracking as special cases, but also serves as a prototype
that paves the way for dealing with more complicated scenarios
of sensor localization. In summary, statistical sensor-anchor
geometry modeling is a powerful method for investigating
the performance limit of sensor localization, which is an
essential problem for sensor networks. To the best of our
knowledge, this method has not been considered before.

Essentially, this scalar metric is a function of U-statistics.
In statistical theory, U-statistics, introduced in the seminal
paper [4], are an important class of statistics and are of great
significance in estimation theory, in that asymptotic properties
of both estimators and test statistics have been derived by
using the Central Limit Theorems for U-statistics. Based on
the theory of U-statistics, we show that as the number of
anchors goes to infinity, this scalar metric is asymptotically
normal. We provide the quantitative relationship among
the mean, the standard deviation, the number of anchors,
the parameters of the communication channels and the distribution
of the anchors. Since our results are based on an asymptotic
analysis, the conditions under which our results approximate
the real situations well are identified.

978-1-61284-799-3/11/$26.00 ©2011 IEEE

The remainder of this paper is organized as follows.
Section II introduces the problem formulation. Section
III presents the main results on the statistical attributes of
the performance limit. Section IV reports simulations. Finally,
we conclude this paper and shed light on future work in Section V.

II.  Problem  Formulation 

In this section, we formulate the scalar metric of the
performance of single-hop sensor localization using RSS
measurements and define a random sensor-anchor geometry
model. Throughout this paper, we shall use the following
mathematical notation: $(\cdot)^T$ denotes the transpose of a matrix
or a vector; $\mathrm{Tr}(\cdot)$ denotes the trace of a square matrix;
$\Pr\{\cdot\}$ denotes the probability of an event; $E(\cdot)$ denotes the
expected value of a random variable; $\mathrm{Var}(\cdot)$ denotes the
variance; $\mathrm{Std}(\cdot)$ denotes the standard deviation.

A.  One-hop  Sensor  Localization  Using  RSS  Measurements 

In a two-dimensional plane, consider a single sensor (or
source, target) located at the origin and N distance (or angle)
measurements made to this sensor at N known locations,
as illustrated in Figure 1. Here, the N known locations are
abstracted as anchors and are labeled $1, \cdots, N$, with the
i-th anchor's location denoted by $s_i = [x_i, y_i]^T$. The true
distance between the sensor and the i-th anchor is denoted
by $d_i = \|s_i\|$. The true angle subtended at the sensor by the
i-th anchor and the positive x-axis is denoted $\theta_i$.

For a specific localization problem, the precise locations of
the N anchors, i.e. $[x_i, y_i]^T$, are given in advance; pair-wise
distance measurements $\{\hat{d}_i, i = 1, \cdots, N\}$ between the
sensor and the anchors are made and obey certain error models.
Then, the aim of single-hop sensor localization is to find an
estimate of the true sensor position using the observable set
of distance measurements $\{\hat{d}_i, i = 1, \cdots, N\}$. In this paper,
we consider the performance limit of sensor localization over
a family of random anchor locations rather than a specific
localization problem with given anchor locations.

Let the sensor be a transmitter and the N anchors be
receivers. Define $\{P_i, i = 1, \cdots, N\}$ to be the measured
received signal powers at the N anchors transmitted by the
sensor. We make the following assumptions:

Assumption 1: The wireless channel satisfies the log-normal
(shadowing) model and the received powers $\{P_i, i = 1, \cdots, N\}$
at the N anchors are statistically independent.


Fig.  1.  Localizing  a  sensor  using  N  anchors. 


Remark 1: Assumption 1 is the basis for converting the
RSS measurements (i.e. received powers) to distance
estimates [5], and is commonly made in studies on RSS-based
sensor localization (e.g. [1], [6]). It follows that
$P_i(\mathrm{dBm}) = 10\log_{10} P_i$ are Gaussian:

\[ P_i(\mathrm{dBm}) = P_0(\mathrm{dBm}) - 10\alpha\log_{10}\frac{d_i}{R_0} + Z, \tag{1} \]

where $P_0(\mathrm{dBm})$ is the mean received power in dBm at a
reference distance $R_0$, $\alpha$ is the path-loss exponent, and $Z$ is a
random variable representing the shadowing effect, normally
distributed with mean zero and variance $\sigma_{dB}^2$ (in dBm).
As pointed out in [7], due to the fact that the log-normal
model does not hold for $d_i = 0$, the close-in distance $R_0$
is introduced as the known received power reference point,
and is virtually the lower bound on practical distances used
in the wireless communication system. Further, $P_0(\mathrm{dBm})$ is
computed from the free space path loss formula (see, e.g. [7]).
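
The log-normal model in (1) and its inversion to a distance estimate can be illustrated with a minimal Python sketch. The function names and the reference power of -40 dBm are assumptions made purely for illustration; the path-loss exponent 2.3 is the value used later in the simulations.

```python
import math
import random


def received_power_dbm(d, p0_dbm, alpha, sigma_db, r0=1.0, rng=random):
    """Draw one RSS sample in dBm from the log-normal shadowing model (1)."""
    shadowing = rng.gauss(0.0, sigma_db)  # Z ~ N(0, sigma_dB^2)
    return p0_dbm - 10.0 * alpha * math.log10(d / r0) + shadowing


def distance_estimate(p_dbm, p0_dbm, alpha, r0=1.0):
    """Invert the mean path-loss curve of (1) to obtain a distance estimate."""
    return r0 * 10.0 ** ((p0_dbm - p_dbm) / (10.0 * alpha))


# Noise-free check: with sigma_dB = 0 the inversion returns the true distance.
# p0_dbm = -40 is a hypothetical reference power, not a value from the paper.
p = received_power_dbm(5.0, p0_dbm=-40.0, alpha=2.3, sigma_db=0.0)
d_hat = distance_estimate(p, p0_dbm=-40.0, alpha=2.3)
```

With a nonzero `sigma_db`, repeated calls scatter the estimate multiplicatively around the true distance, which is why longer links carry larger absolute distance errors.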

B.  A  Random  Sensor-Anchor  Geometry  Model 

Assumption 2: The N anchors are randomly and uniformly
distributed inside the annulus centered at the sensor
and defined by radii $R_0$ and $R$ ($R > R_0 > 0$).

Remark 2: In Assumption 2, $R$ is the upper bound on
practical distances, which is normally restricted by the factors
determining path loss attenuation; $R_0$, though representing
the lower bound, is mainly devised to avoid inconvenience
in calculations, and theoretically speaking, any
arbitrarily small positive number can be the lower bound.
By Assumption 2, each possible sensor-anchor geometry is
as probable as another, in the sense that the sensor-anchor
geometry follows a "uniform" distribution. Furthermore, it is
easy to show that $\{d_i, i = 1, \cdots, N\}$ and $\{\theta_i, i = 1, \cdots, N\}$
are mutually independent.
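
A minimal Python sketch of Assumption 2 follows. It draws anchors uniformly by area from the annulus using the inverse of the radial distribution function and an independent uniform angle. The helper name `sample_anchor` and the sample size are illustrative assumptions; the closed-form value of $E(d^{-2})$ used for comparison is obtained by integrating $d^{-2}$ against the radial density $2d/(R^2 - R_0^2)$.

```python
import math
import random


def sample_anchor(r0, r, rng):
    """Draw one anchor uniformly (by area) from the annulus r0 <= d <= r."""
    u = rng.random()
    d = math.sqrt(r0**2 + u * (r**2 - r0**2))  # inverse CDF of the radial law
    theta = rng.uniform(0.0, 2.0 * math.pi)    # angle is independent of distance
    return d, theta


rng = random.Random(0)
anchors = [sample_anchor(1.0, 10.0, rng) for _ in range(100000)]

# Empirical mean of 1/d^2 versus the value 2 ln(R/R0) / (R^2 - R0^2).
mean_inv_sq = sum(1.0 / d**2 for d, _ in anchors) / len(anchors)
exact = 2.0 * math.log(10.0 / 1.0) / (10.0**2 - 1.0**2)
```

Sampling the radius as `r0 + u * (r - r0)` instead would over-populate the inner part of the annulus, so the square-root form matters for reproducing the uniform geometry.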

C.  The  Scalar  Metric 

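The probability density function (pdf) of $P_i$ can be
formulated as follows

\[ f(P_i) = \frac{10}{(\ln 10)\sqrt{2\pi\sigma_{dB}^2}\,P_i} \exp\left\{-\frac{b}{8}\left(\ln\frac{d_i^2}{\hat{d}_i^2}\right)^2\right\}, \tag{2} \]

where $b = \left(\frac{10\alpha}{\sigma_{dB}\ln 10}\right)^2$ and $\hat{d}_i = R_0\left(\frac{P_0}{P_i}\right)^{1/\alpha}$.

For the purpose of computing the CRLB for sensor
localization using the RSS measurements, we formulate the
Fisher information matrix (FIM) $F_{RSS}$ as

\[ F_{RSS} = b \begin{bmatrix} \sum_{i=1}^{N} \frac{\cos^2\theta_i}{d_i^2} & \sum_{i=1}^{N} \frac{\cos\theta_i\sin\theta_i}{d_i^2} \\ \sum_{i=1}^{N} \frac{\cos\theta_i\sin\theta_i}{d_i^2} & \sum_{i=1}^{N} \frac{\sin^2\theta_i}{d_i^2} \end{bmatrix}. \tag{3} \]

A detailed derivation can be found in [1]. If $F_{RSS}$ is
non-singular, the CRLB, denoted $C_{RSS}$, is just the inverse of
$F_{RSS}$. Then, we define $\mathrm{Tr}(C_{RSS})$ to be a metric for the
performance limit of localizing the sensor and have

\[ \mathrm{Tr}(C_{RSS}) = \frac{1}{b}\,\frac{\sum_{i=1}^{N} d_i^{-2}}{\sum_{1\le i<j\le N} \frac{\sin^2(\theta_i-\theta_j)}{d_i^2 d_j^2}}. \tag{4} \]

Since $\{d_i, i = 1, \cdots, N\}$ and $\{\theta_i, i = 1, \cdots, N\}$ are
random variables, $\mathrm{Tr}(C_{RSS})$ is obviously a random variable.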
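
The relation between the FIM in (3) and the closed form in (4) can be checked numerically. The sketch below is illustrative only: the function names and the four-anchor geometry are assumptions, and $b = (10\alpha/(\sigma_{dB}\ln 10))^2$ follows the convention of [1]. It computes the trace of the inverse of the 2x2 FIM directly and compares it with the ratio of a single sum to a pairwise sum.

```python
import math


def b_coefficient(alpha, sigma_db):
    """Channel constant b = (10 * alpha / (sigma_dB * ln 10))^2."""
    return (10.0 * alpha / (sigma_db * math.log(10.0))) ** 2


def fisher_information(anchors, b):
    """2x2 FIM of (3); anchors is a list of (d_i, theta_i) pairs."""
    fxx = sum(math.cos(t) ** 2 / d**2 for d, t in anchors)
    fxy = sum(math.cos(t) * math.sin(t) / d**2 for d, t in anchors)
    fyy = sum(math.sin(t) ** 2 / d**2 for d, t in anchors)
    return [[b * fxx, b * fxy], [b * fxy, b * fyy]]


def trace_crlb_from_fim(fim):
    """Trace of the inverse of a 2x2 matrix: (F11 + F22) / det(F)."""
    det = fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]
    return (fim[0][0] + fim[1][1]) / det


def trace_crlb_closed_form(anchors, b):
    """Closed form (4): a single sum divided by b times a pairwise sum."""
    num = sum(1.0 / d**2 for d, _ in anchors)
    den = sum(
        math.sin(ti - tj) ** 2 / (di**2 * dj**2)
        for i, (di, ti) in enumerate(anchors)
        for dj, tj in anchors[i + 1:]
    )
    return num / (b * den)


b = b_coefficient(alpha=2.3, sigma_db=3.92)
# Hypothetical geometry: (distance in m, angle in rad) for four anchors.
anchors = [(2.0, 0.3), (3.5, 1.9), (1.5, 4.0), (6.0, 5.5)]
via_fim = trace_crlb_from_fim(fisher_information(anchors, b))
via_formula = trace_crlb_closed_form(anchors, b)
```

The two values agree because the determinant of the FIM reduces, by Lagrange's identity, to $b^2$ times the pairwise sum of $\sin^2(\theta_i-\theta_j)/(d_i^2 d_j^2)$.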


D. U-statistics

U-statistics are very natural in statistical work, particularly
in the context of independent and identically distributed
(i.i.d.) random variables, or more generally for exchangeable
sequences, such as in simple random sampling from a finite
population. The origins of the theory of U-statistics are
traceable to the seminal paper [4], which proved the Central Limit
Theorems for U-statistics. Following the publication of this
seminal paper, interest in this class of statistics steadily
increased, crystallizing into a well-defined and vigorously
developing line of research in probability theory. The formal
definition is as follows:

Definition 1: Let $\{X_i, i = 1, \cdots, N\}$ be i.i.d.
p-dimensional random vectors. Let $h(x_1, \cdots, x_r)$ be a Borel
function on $\mathbb{R}^{r\times p}$ for a given positive integer $r$ ($\le N$)
that is symmetric in its arguments. A U-statistic $U_N$ is

\[ U_N = \binom{N}{r}^{-1} \sum_{1\le i_1<\cdots<i_r\le N} h(X_{i_1}, \cdots, X_{i_r}) \tag{5} \]

and $h(x_1, \cdots, x_r)$ is called the kernel of $U_N$.

It is obvious that $\mathrm{Tr}(C_{RSS})$ involves the ratio of two
U-statistics according to (4), which motivates us to study
$\mathrm{Tr}(C_{RSS})$ through an asymptotic analysis based on the
theory of U-statistics.
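
As a small illustration of Definition 1, the following Python sketch evaluates a U-statistic by brute force over all size-r subsets. The function name `u_statistic` and the data are assumptions for illustration. The example uses the classical kernel $(x-y)^2/2$ with $r = 2$, for which the U-statistic coincides with the unbiased sample variance.

```python
from itertools import combinations
from math import comb
from statistics import variance


def u_statistic(samples, kernel, r):
    """Average a symmetric kernel over all size-r subsets, as in (5)."""
    total = sum(kernel(*subset) for subset in combinations(samples, r))
    return total / comb(len(samples), r)


# With kernel (x - y)^2 / 2 and r = 2, the U-statistic is the sample variance.
data = [2.0, 3.5, 1.0, 4.5, 6.0]
u_var = u_statistic(data, lambda x, y: (x - y) ** 2 / 2.0, r=2)
```

The pairwise sum in the denominator of (4) has the same structure with $r = 2$ and kernel $\sin^2(\theta_i-\theta_j)/(d_i^2 d_j^2)$, while the numerator is the $r = 1$ case.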


III.  Main  Results 

Due to the complexity of $\mathrm{Tr}(C_{RSS})$, it is very difficult to
give its exact distribution directly. As such, we first
present an asymptotic analysis. Due to the space
limit, proofs are omitted.


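A. Theories

According to (4), a key property of $\mathrm{Tr}(C_{RSS})$ is that it
is the ratio of two sums of random variables, which can be
processed by using the following lemma.

Lemma 1: Given $\{X_i^{(1)}, i = 1, \cdots, N\}$ and
$\{X_i^{(2)}, i = 1, \cdots, N\}$ where

• $\{X_i^{(1)}, i = 1, \cdots, N\}$ are i.i.d. random variables with
bounded values;

• $\{X_i^{(2)}, i = 1, \cdots, N\}$ are i.i.d. random variables with
bounded values;

• $\{X_i^{(1)}, i = 1, \cdots, N\}$ and $\{X_i^{(2)}, i = 1, \cdots, N\}$ are
mutually independent,

define vectors $X_i = [X_i^{(1)}\ X_i^{(2)}]^T$ ($i = 1, \cdots, N$) and
two sequences of random variables

\[ T_N = \frac{1}{N}\sum_{i=1}^{N} X_i^{(1)}, \tag{6} \]

\[ S_N = \frac{2}{N(N-1)} \sum_{1\le i<j\le N} X_i^{(1)} X_j^{(1)} \sin^2\left(X_i^{(2)} - X_j^{(2)}\right). \tag{7} \]

Then, as $N \to \infty$,

\[ \frac{T_N}{S_N} = \frac{1}{m_1 m_2} + \frac{2\sigma_1^2}{N m_1^3 m_2} + M_N + R_N, \tag{8} \]

where $m_1 = E(X_1^{(1)})$, $\sigma_1 = \mathrm{Std}(X_1^{(1)})$,
$m_2 = E(\sin^2(X_1^{(2)} - X_2^{(2)}))$,

\[ M_N = \frac{1}{N}\sum_{i=1}^{N} g_1(X_i) + \frac{2}{N(N-1)} \sum_{1\le i<j\le N} g_2(X_i, X_j), \tag{9} \]

\[ g_1(X_i) = \frac{m_1 - X_i^{(1)}}{m_1^2 m_2}, \tag{10} \]

\[ g_2(X_i, X_j) = \frac{1}{m_1 m_2} - \frac{X_i^{(1)} + X_j^{(1)}}{m_1^2 m_2} + \frac{2 X_i^{(1)} X_j^{(1)}}{m_1^3 m_2} - \frac{X_i^{(1)} X_j^{(1)} \sin^2\left(X_i^{(2)} - X_j^{(2)}\right)}{m_1^3 m_2^2}, \tag{11} \]

and $R_N$ is the remainder term. For any $\epsilon > 0$, $R_N$ satisfies

\[ \Pr\{|N R_N| > \epsilon\} = O(N^{-1}), \tag{12} \]

\[ \Pr\{|N (\ln N) R_N| > \epsilon\} = o(1). \tag{13} \]

In Lemma 1, by letting $X_i^{(1)} = 1/d_i^2$ and $X_i^{(2)} = \theta_i$, we
have $m_2 = 0.5$, and

\[ m_1 = \frac{2\ln(R/R_0)}{R^2 - R_0^2}, \tag{14} \]

\[ \sigma_1 = \sqrt{\frac{1}{R_0^2 R^2} - \frac{4\ln^2(R/R_0)}{(R^2 - R_0^2)^2}}, \tag{15} \]

and our main result is further summarized as follows.

Theorem 1: Let $m_1$ and $\sigma_1$ be defined by (14) and (15).
Define a sequence of random variables

\[ W_N = \frac{\sqrt{N}(N-1) b m_1^2}{4\sigma_1}\,\mathrm{Tr}(C_{RSS}) - \frac{\sqrt{N} m_1}{\sigma_1} - \frac{2\sigma_1}{\sqrt{N} m_1}. \tag{16} \]

Then, as $N \to \infty$, $W_N$ converges in distribution to a
standard normal random variable.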
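
The constants in (14) and (15) can be checked directly from Assumption 2. The following worked derivation is a sketch under that assumption: it uses the radial density of a point uniformly distributed in the annulus and the independence of the angles, with $f_d$ introduced here only as a name for that density.

```latex
% Radial density of an anchor uniformly distributed in the annulus R_0 <= d <= R
f_d(d) = \frac{2d}{R^2 - R_0^2}, \qquad R_0 \le d \le R .

% First moment of X^{(1)} = 1/d^2, giving (14)
m_1 = E\!\left(d^{-2}\right)
    = \int_{R_0}^{R} \frac{1}{d^2}\,\frac{2d}{R^2 - R_0^2}\,\mathrm{d}d
    = \frac{2\ln(R/R_0)}{R^2 - R_0^2} .

% Second moment of X^{(1)}
E\!\left(d^{-4}\right)
    = \int_{R_0}^{R} \frac{1}{d^4}\,\frac{2d}{R^2 - R_0^2}\,\mathrm{d}d
    = \frac{R_0^{-2} - R^{-2}}{R^2 - R_0^2}
    = \frac{1}{R_0^2 R^2} .

% Variance, giving (15)
\sigma_1^2 = E\!\left(d^{-4}\right) - m_1^2
    = \frac{1}{R_0^2 R^2} - \frac{4\ln^2(R/R_0)}{(R^2 - R_0^2)^2} .

% Angles independent and uniform on [0, 2\pi)
m_2 = E\sin^2(\theta_1 - \theta_2)
    = \tfrac{1}{2} - \tfrac{1}{2}\,E\cos\bigl(2(\theta_1 - \theta_2)\bigr)
    = \tfrac{1}{2} .
```

For example, with $R_0 = 1$ m and $R = 6$ m this gives $m_1 = 2\ln 6/35 \approx 0.1024$ and $\sigma_1^2 = 1/36 - m_1^2 \approx 0.0173$.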

Remark 3: In view of the linear relationship between $W_N$
and $\mathrm{Tr}(C_{RSS})$, it is clear that $\mathrm{Tr}(C_{RSS})$ is asymptotically
normal. Therefore, for a sufficiently large $N$, the distribution
of $\mathrm{Tr}(C_{RSS})$ can be approximated by the normal distribution

\[ \mathcal{N}\left( \frac{4}{(N-1) b m_1}\left(1 + \frac{2\sigma_1^2}{N m_1^2}\right),\ \left(\frac{4\sigma_1}{\sqrt{N}(N-1) b m_1^2}\right)^2 \right). \tag{17} \]

Most importantly, the above normal random variable makes
it possible for us to analytically study the performance
limit, i.e. $\mathrm{Tr}(C_{RSS})$. Firstly, we can obtain a comprehensive
knowledge about how $\mathrm{Tr}(C_{RSS})$ is statistically distributed
and how $\mathrm{Tr}(C_{RSS})$ is affected by $N$. Secondly, using the
normal distribution function from (17), we can compute the
probability that $\mathrm{Tr}(C_{RSS})$ is below a given threshold for a
known value of $N$; in turn, we can determine a threshold
such that $\mathrm{Tr}(C_{RSS})$ is below the threshold with a certain
confidence level, say 0.99; in addition, we can find the
minimum $N$ such that $\mathrm{Tr}(C_{RSS})$ is below a given threshold
with a certain confidence level. Such analysis is undoubtedly
helpful for the design and deployment of sensor networks.
Thirdly, the moments of $\mathrm{Tr}(C_{RSS})$ can be approximated by
the corresponding moments of the normal variable defined
by (17), namely

\[ E(\mathrm{Tr}(C_{RSS})) \approx \frac{4}{(N-1) b m_1} + \frac{8\sigma_1^2}{N(N-1) b m_1^3}, \tag{18} \]

\[ \mathrm{Std}(\mathrm{Tr}(C_{RSS})) \approx \frac{4\sigma_1}{\sqrt{N}(N-1) b m_1^2}, \tag{19} \]

which characterize the relationship among the mean and
standard deviation of $\mathrm{Tr}(C_{RSS})$, the number of anchors,
the noise statistics of the RSS measurements and the spatial
distribution of the anchors.
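
The design uses described in Remark 3 can be sketched in Python. The block below assumes the mean in (18) has the form $4/((N-1)bm_1) + 8\sigma_1^2/(N(N-1)bm_1^3)$ and the standard deviation in (19) the form $4\sigma_1/(\sqrt{N}(N-1)bm_1^2)$, with $b = (10\alpha/(\sigma_{dB}\ln 10))^2$. All function names are illustrative, and the default channel values are those used in the simulation section.

```python
import math
from statistics import NormalDist


def channel_b(alpha, sigma_db):
    """Channel constant b = (10 * alpha / (sigma_dB * ln 10))^2."""
    return (10.0 * alpha / (sigma_db * math.log(10.0))) ** 2


def annulus_moments(r0, r):
    """Mean m1 and standard deviation sigma1 of 1/d^2 under Assumption 2."""
    m1 = 2.0 * math.log(r / r0) / (r**2 - r0**2)
    sigma1 = math.sqrt(1.0 / (r0**2 * r**2) - m1**2)
    return m1, sigma1


def mean_trace(n, r, r0=1.0, alpha=2.3, sigma_db=3.92):
    """Approximate mean of Tr(C_RSS), in the assumed form of (18)."""
    b = channel_b(alpha, sigma_db)
    m1, s1 = annulus_moments(r0, r)
    return 4.0 / ((n - 1) * b * m1) + 8.0 * s1**2 / (n * (n - 1) * b * m1**3)


def std_trace(n, r, r0=1.0, alpha=2.3, sigma_db=3.92):
    """Approximate standard deviation of Tr(C_RSS), assumed form of (19)."""
    b = channel_b(alpha, sigma_db)
    m1, s1 = annulus_moments(r0, r)
    return 4.0 * s1 / (math.sqrt(n) * (n - 1) * b * m1**2)


def threshold_at_confidence(n, r, confidence=0.99, **channel):
    """Value that Tr(C_RSS) stays below with the given probability under (17)."""
    dist = NormalDist(mean_trace(n, r, **channel), std_trace(n, r, **channel))
    return dist.inv_cdf(confidence)


def min_anchors(threshold, r, confidence=0.99, n_max=1000, **channel):
    """Smallest N (searching upward from 2) meeting the threshold, or None."""
    for n in range(2, n_max + 1):
        if threshold_at_confidence(n, r, confidence, **channel) <= threshold:
            return n
    return None


print(mean_trace(15, 6.0), mean_trace(10, 4.0))
```

Because the normal approximation is asymptotic, values returned for small N should be treated with caution; the simulation section indicates where the approximation becomes reliable.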

A natural question arises as to how large $N$ should
be to obtain a good approximation; this gives rise to the
convergence rate study. In the literature on U-statistics, the
Berry-Esseen bound was developed for characterizing the
convergence rates of U-statistics [8], [9]. Because $W_N$ is
affine to a U-statistic (i.e. $M_N$), we propose the following
theorem describing the convergence rate of $W_N$ in a way
similar to the Berry-Esseen bound.

Theorem  2:  Use  the  notations  in  Theorem  1  and  define 

^  ■  (20) 


Then,  as  N  — >  oo, 


(x2  —  l)e  X2 

\  ^  ) 

sup  \Fn(x)  -  $(a;)|  < 


N~z  +  0(N~1)  (21) 


where $F_N(x)$ is the distribution function of $W_N$ and $\Phi(x)$
is the standard normal distribution function.

Remark 4: Theorem 2 shows that as $N \to \infty$, the
density of $W_N$ converges to standard normality with the rate
$O(N^{-1/2})$. Additionally, it can be verified that the coefficient
associated with $N^{-1/2}$ is a function of the ratio $R/R_0$; that is
to say, the convergence rate of the density of $W_N$ is not
determined by the individual values of $R_0$ and $R$, but by the
ratio $R/R_0$.

Fig. 2. (a) The distribution functions and (b) the pdfs of $\mathrm{Tr}(C_{RSS})$
with $R_0 = 1$ m, $R = 10$ m, $\alpha = 2.3$ and $\sigma_{dB} = 3.92$.


IV.  Simulations 

In this section, we carry out simulations
to verify Theorem 1 and the approximations given in (18) and
(19). The parameters $\alpha$, $\sigma_{dB}$ and $R_0$ describing the wireless
channel are set to 2.3, 3.92 and 1 m, respectively, which were
measured in a practical environment [1].

Firstly, we plot in Fig. 2(a) the actual distribution functions
of $\mathrm{Tr}(C_{RSS})$ (with the legend "Simulation") and the normal
distribution function (17) (with the legend "Formula") for
$N = 5, 10, 15, 20$. As can be seen, when $N = 5$, the
discrepancy between them is quite obvious; when $N = 10$,
the discrepancy becomes very small; when $N = 15$ or 20,
the discrepancy is negligible. The discrepancy reduces with
increasing $N$, as illustrated in Fig. 2(a), and arises for two
reasons: the intrinsic error in approximating a U-statistic
by normality, and the existence of the remainder term $R_N$,
which obeys $\Pr\{|R_N| > \epsilon/N\} = O(N^{-1})$, see (12), and
though nonzero is neglected in the calculation. Furthermore,
we plot the corresponding pdfs in Fig. 2(b). It can be seen
that the overall shapes of the actual pdfs (with the legend
"Simulation") are quite similar to those of normality, and the
discrepancy between them reduces with increasing $N$. These
observations are consistent with and in turn demonstrate
Theorem 1.
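
A comparison of this kind can be reproduced in outline with a short Monte Carlo sketch. The Python code below is not the authors' simulation code: the function names, the number of runs and the seed are assumptions, and the formula for the mean is taken in the form $4/((N-1)bm_1) + 8\sigma_1^2/(N(N-1)bm_1^3)$ with $b = (10\alpha/(\sigma_{dB}\ln 10))^2$. It draws random geometries under Assumption 2, evaluates the closed form (4) for each, and prints the simulated mean and median next to the formula value.

```python
import math
import random
from statistics import mean, median


def trace_crlb(anchors, b):
    """Closed form (4) for a list of (d_i, theta_i) pairs."""
    num = sum(1.0 / d**2 for d, _ in anchors)
    den = sum(
        math.sin(ti - tj) ** 2 / (di**2 * dj**2)
        for i, (di, ti) in enumerate(anchors)
        for dj, tj in anchors[i + 1:]
    )
    return num / (b * den)


def simulate_traces(n, r, r0=1.0, alpha=2.3, sigma_db=3.92, runs=2000, seed=1):
    """Sample Tr(C_RSS) over random anchor geometries in the annulus."""
    rng = random.Random(seed)
    b = (10.0 * alpha / (sigma_db * math.log(10.0))) ** 2
    samples = []
    for _ in range(runs):
        anchors = [
            (math.sqrt(r0**2 + rng.random() * (r**2 - r0**2)),
             rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(n)
        ]
        samples.append(trace_crlb(anchors, b))
    return samples


def approx_mean(n, r, r0=1.0, alpha=2.3, sigma_db=3.92):
    """Asymptotic mean of Tr(C_RSS) in the assumed form of (18)."""
    b = (10.0 * alpha / (sigma_db * math.log(10.0))) ** 2
    m1 = 2.0 * math.log(r / r0) / (r**2 - r0**2)
    var1 = 1.0 / (r0**2 * r**2) - m1**2
    return 4.0 / ((n - 1) * b * m1) + 8.0 * var1 / (n * (n - 1) * b * m1**3)


samples = simulate_traces(n=20, r=10.0)
print("simulated mean  :", mean(samples))
print("simulated median:", median(samples))
print("formula         :", approx_mean(20, 10.0))
```

Repeating the run for smaller N shows the sample mean becoming unstable, since rare near-collinear geometries produce very large values of the trace.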

Secondly, we plot the means and the standard deviations
of $\mathrm{Tr}(C_{RSS})$ from both simulations and the formulas (18)
and (19) in Figs. 3(a), 3(c), 3(b) and 3(d). It is evident that
the larger $N$ is, the more precise the formulas are. When
$N = 5$, the standard deviation attains comparatively large
values, and the associated surface in Fig. 3(b) is non-smooth;
the most probable reason is that the actual standard deviation
is infinite for an $N$ as small as 5.

For better comparison, we define the relative error to be
the ratio of the difference between the quantity from the
simulations and that from the corresponding formula to the
former, and plot the relative errors in Fig. 3(e) and 3(f). It can be
seen that: (i) the mean is underestimated by (18) when $R$ is
small, say $R = 2$ m, but is overestimated by (18) when $R$ is
large, say $R = 10$ m, while the associated absolute value of
the relative error decreases with increasing $N$ in most cases;
(ii) the standard deviation of $\mathrm{Tr}(C_{RSS})$ is underestimated
by (19) and the associated absolute value of the relative
error decreases with increasing $N$ and $R$; (iii) suppose an
absolute value of the relative error below 10% is acceptable:
when $R = 2$ m, (18) is applicable if $N > 6$, but (19) is not
applicable even if $N = 20$; when $R = 10$ m, both (18) and
(19) are applicable if $N > 11$.

In what follows, we present some useful remarks on the
properties of sensor localization provided that (18) and (19)
are applicable. It is notable that in (18) and (19), the mean
and standard deviation of $\mathrm{Tr}(C_{RSS})$ normalized by $R^2$ (or
$R_0^2$) are dependent upon the ratio $R/R_0$; hence, we simplify
the discussion involving $R_0$ and $R$ by letting $R_0 = 1$ m and
only concentrating on $R$.

Remark 5: Equation (18) quantitatively characterizes the
average performance limit over all possible sensor-anchor
geometries and is indicative for evaluating the average
localization performance over a period of time and/or in a wide
region. In addition, because the mean is in inverse proportion
to $N$, a critical value $N^*$, varying with the parameters
$R_0$, $R$, $\sigma_{dB}$ and $\alpha$, can be determined, such that having more
anchors than $N^*$ contributes little to the quality of sensor
localization.

Remark 6: It can be easily deduced that both (18) and
(19) decrease monotonically as $R$ decreases, as illustrated
in Fig. 3(c) and 3(d); the reason is that long-distance
measurements from RSS suffer greater errors, and thus produce
worse localization performance. Therefore, given a fixed $N$,
distance measurements from a sensor are better made at
locations as close to the sensor as possible. Moreover, it turns
out that using more distance measurements spread over a
wide range is not necessarily better than using fewer distance
measurements spread over a narrow range in terms of the
average performance limit. For instance, $E(\mathrm{Tr}(C_{RSS}))$ is
approximately 0.52431 m² given $N = 15$ and $R = 6$ m,
while a smaller mean, approximately 0.43174 m²,
can be achieved given $N = 10$ and $R = 4$ m. Thus, a tradeoff
should be made between the number of anchors (i.e. $N$) and
their spreading (i.e. $R_0$ and $R$) in sensor localization.

Remark 7: Though we discuss the impacts of $N$ and $R$
separately, the variables are correlated in some situations,
and so the impacts are related. Normally, increasing all the
transmission powers in a wireless sensor network enlarges
the communication coverage of every node, and both $N$ and
$R$ for localizing one sensor tend to rise, but $\mathrm{Tr}(C_{RSS})$ and
its mean will definitely decrease according to [1].

Remark 8: The dispersion of $\mathrm{Tr}(C_{RSS})$ reflects its
sensitivity to sensor-anchor geometries. Specifically, with a large
dispersion, the chance that two different sensor-anchor
geometries lead to a big difference in the resulting values
of $\mathrm{Tr}(C_{RSS})$ is large, implying a large sensitivity, and
we should be careful about sensor-anchor geometries; by
contrast, with a small dispersion, the chance is
small, so is the sensitivity, and there is less reason to
worry about sensor-anchor geometries even if the anchors are
randomly deployed. Given a random variable, the coefficient
of variation, defined to be the ratio of its standard deviation
to its mean, is a normalized measure of dispersion of
its distribution. By (18) and (19), the coefficient of variation of
$\mathrm{Tr}(C_{RSS})$ has the order of $O(N^{-1/2})$, and the smaller
the coefficient, the smaller the sensitivity. In particular, if the
coefficient equals its minimum, i.e. 0, all the sensor-anchor
geometries will result in one unique value of $\mathrm{Tr}(C_{RSS})$,
so that the minimum sensitivity is attained. Alternatively,
we can observe the sensitivity from Fig. 2(b): the range of
$\mathrm{Tr}(C_{RSS})$ with a non-trivial probability becomes narrower
and narrower with increasing $N$, implying that the sensitivity
is reducing.
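
The order of the coefficient of variation can be made explicit with a short worked calculation. The sketch below takes (18) in the form $\frac{4}{(N-1)bm_1}\left(1+\frac{2\sigma_1^2}{Nm_1^2}\right)$ and (19) in the form $\frac{4\sigma_1}{\sqrt{N}(N-1)bm_1^2}$; these forms are assumptions about the two approximations, chosen to be consistent with the numerical values quoted in Remark 6.

```latex
\frac{\mathrm{Std}(\mathrm{Tr}(C_{RSS}))}{E(\mathrm{Tr}(C_{RSS}))}
\approx
\frac{\dfrac{4\sigma_1}{\sqrt{N}(N-1)\,b\,m_1^2}}
     {\dfrac{4}{(N-1)\,b\,m_1}\left(1+\dfrac{2\sigma_1^2}{N m_1^2}\right)}
= \frac{\sigma_1}{\sqrt{N}\,m_1}\cdot\frac{1}{1+\dfrac{2\sigma_1^2}{N m_1^2}}
= \frac{\sqrt{N}\,m_1\,\sigma_1}{N m_1^2 + 2\sigma_1^2}
= O\!\left(N^{-1/2}\right).
```

Under these approximations the factor $b$ cancels, so the coefficient of variation depends on $N$ and on the anchor distribution through $m_1$ and $\sigma_1$, but not on the channel parameters $\alpha$ and $\sigma_{dB}$.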

V.  Conclusion  and  Future  Work 

In this paper, we investigated the performance limit of
single-hop sensor localization with RSS measurements
by statistical sensor-anchor geometry modeling. That is,
the positions of the anchors are assumed to be random and
the statistical attributes of the trace of the CRLB matrix
embody essential features of sensor localization. With strict
mathematical proofs, we showed that the trace of the CRLB
matrix is asymptotically normal. Based on this study, we
analyzed the features of sensor localization and carried out
extensive simulations.

In future work, we would like to take into account
distributions of anchor positions other than the uniform
distribution, as well as other types of measuring
techniques, including Time of Arrival (TOA), Time
Difference of Arrival (TDOA), etc. In addition, it is more attractive,
but of course extremely difficult, to conduct similar studies
for multi-hop sensor localization.
One-dimensional sensor networks can be found in many fields and demand node
location information for various applications. Developing localization algorithms in
one-dimensional sensor networks is trivial, due to the fact that existing localization
algorithms developed for two- and three-dimensional sensor networks are applicable;
nevertheless, analyzing the corresponding localization errors is by no means trivial,
and it is helpful for improving localization accuracy and designing sensor network
applications. This paper deals with localization errors in distance-based multi-hop
localization procedures of one-dimensional sensor networks through the Cramer-Rao
lower bound (CRLB). We analyze the fundamental behaviors of localization errors and
show that the localization error for a sensor is locally determined by network elements
within a certain range of this sensor. Moreover, we break down the analysis of
localization errors in a large-scale sensor network into the analysis in small-scale
sensor networks, termed unit networks, in which tight upper and lower bounds on the
CRLB can be established. Finally, we investigate two practical issues: the applicability of
the analysis based on the CRLB and the optimal anchor placement.

© 2011 Elsevier B.V. All rights reserved.


1.  Introduction 

One-dimensional sensor networks can be found in
many practical scenarios, such as high-voltage power
lines [1-3], gas (water, oil) pipelines [4-7], mining
tunnels [8], vehicular networks along a highway [9], bushfire
detection [10] and so on. Regardless of the dimension of a
sensor network, knowledge of node location is often used
to report the geographic origin of events, to assist in
target tracking, to achieve geographic aware routing, to
manage sensor networks, to evaluate their coverage, and
so on. Node localization is often posed as an optimization
problem to estimate a most probable location among a set
of possibilities [11-14], because distance and angle
measurements suffer from environmental and intrinsic hardware
noise in reality and are therefore imprecise. Most of the
existing localization algorithms, though developed for
two- and three-dimensional sensor networks, can be
directly applied in one-dimensional cases.

Localization algorithms usually assume the existence
of anchors, i.e. nodes whose locations are known a priori
or which are equipped with extra devices, such as GPS
receivers, through which their locations can be obtained.
The remaining nodes, termed sensors, besides being able
to sense an event of interest, typically can sense a distance
or angle assisting in localization. If a sensor has direct
measurements from a sufficient number of anchors to
determine its location, its localization is termed one-hop
localization; otherwise, it is termed multi-hop
localization [15-17]. Localization using GPS is a specific case of
one-hop localization; given a sensor network involving a
small fraction of nodes as anchors, localization is normally
multi-hop.

A challenging problem in relation to localization is the
analysis of localization errors, which has received considerable
attention. For the one-hop case, localization errors are
formulated by GDOP [18,19], the Cramer-Rao lower bound
(CRLB) [15,16,20-22], Bayesian bounds [23], etc., and have
been well understood; for the multi-hop case, on account of
the complexity in the construction of localization errors,
only simulation studies and very limited analytical studies
have been introduced [24-26].

A  closely  relevant  problem  is  known  as  “error  propaga¬ 
tion”  [12,15].  Particularly,  consider  a  two-dimensional 
multi-hop  localization  procedure  using  trilateration  [14], 
in  which  a  sensor  is  localizable  only  if  it  measures  at  least 
three  distances  from  non-collinear  anchors  and/or  already 
localized  sensors  (which  perform  as  pseudo-anchors);  if 
pseudo-anchors  are  involved,  the  errors  in  pseudo-anchor 
location  estimates  are  propagated  into  location  estimates 
of  later  localized  sensors.  In  effect,  the  phenomenon  of 
propagating  (i.e.  growing)  localization  errors  is  a  common 
characteristic  in  the  case  of  multi-hop  localization.  In  view 
of  the  increasing  deployment  of  sensor  networks  for 
various applications, it is evidently desirable to understand
how localization errors grow among sensors given a
configuration  of  sensor  network  parameters,  such  as 
sensor  density,  anchor  density,  anchor  placement  strategy, 
level  of  distance  measurement  errors,  etc.  Based  on  the 
study  of  localization  errors,  developers  can  choose  proper 
parameters  to  control  localization  errors;  in  addition, 
location  estimates  provide  context  to  sensed  data  (e.g. 
temperature,  humidity,  etc),  and  the  study  of  localization 
errors  underpins  assessment  of  the  reliability  of  location 
estimates  of  different  sensors,  which  is  helpful  to  further 
data  processing  in  various  sensor  network  applications. 

In this paper, we attempt to analyze localization errors
in distance-based multi-hop localization procedures of
one-dimensional sensor networks through the CRLB. We use
the CRLB because it is algorithm-independent and can be
attained by a localization algorithm using the
maximum-likelihood estimator (MLE) in the one-dimensional
case (as is proved in this paper). By treating distance
measurements between nodes as connections constituting
networks, we consider a sensor network as a general network.
Because of the complexity associated with a general network,
it proves useful to consider, as an intermediate step, unit
networks, which are defined as follows:

Definition  1.  A  one-dimensional  connected  network  is  a 
unit  network  if  and  only  if  all  anchors  are  placed  at  the 
left  side  of  all  sensors  and/or  the  right  side  of  all  sensors. 

As shown in Fig. 1, there are no anchors between any
pair of sensors in unit networks. In the light of this
feature, a divide and conquer method is developed in
the sense that by building on results for unit networks,
conclusions regarding general networks can be drawn, as
will be demonstrated.

Fig. 1. Examples of unit networks. In (a), anchors are placed on the
leftmost side; in (b), anchors are placed on both the leftmost and the
rightmost sides.

The major contributions of this paper include: proofs
for the fundamental behavior of localization errors in
one-dimensional sensor networks; a divide and conquer
method for analyzing localization errors in large-scale
one-dimensional sensor networks; the behavior of
localization errors in small-scale unit networks; the
consideration of two practical issues regarding the analysis,
viz. applicability of the analysis based on the CRLB and the
optimal anchor placement.

The remainder of this paper is organized as follows.
The next section reviews the literature. Section 3 provides
the formulation of the CRLB in one-dimensional networks
and presents certain fundamental behaviors of localization
errors. Section 4 proposes a divide and conquer
method to analyze localization errors in large-scale
networks based on unit networks. Section 5 further discusses
two practical issues. Finally, Section 6 concludes the paper
and sheds light on future work.

2.  Related  work 

In this section, we first review the algorithm-independent
analysis of localization errors and then introduce
some analysis focusing on error propagation.

In [15], Savvides et al. investigated the different
aspects of the error of location estimates in distance-based
multi-hop localization procedures. They formulated
the CRLBs for sensors in a network under the Gaussian
model, which assumes that the errors in distance
measurements are i.i.d. (independent and identically
distributed) Gaussian with mean zero, and computed the root
mean squared error (RMSE) as a metric for the localization
accuracy of this network using the derived CRLBs.
Then, simulation was carried out to study the impacts of
various network parameters, including sensor and anchor
densities, level of distance measurement errors and network
size, on the RMSE. Additionally, a similar study was
reported for networks with distance and/or angle
measurements in [16].

In [20], Chang et al. also dealt with distance-based
multi-hop localization procedures under the Gaussian
model, and provided a geometric interpretation of the
CRLB in the sense that the CRLB is essentially invariant
under zooming, translation, and rotation. Moreover, they
derived lower and upper bounds on the CRLB in
two-dimensional networks which are similar to the bounds we
will establish in Section 4. Apart from this, they discussed
at length the error behaviors of a special kind of
localization procedure in which no anchors are involved and
only relative locations can be obtained.

In [23], Wang et al. focused on the error behaviors in
one-hop localization procedures. Given anchor locations
and errors in distance measurements from a sensor to the
anchors, their method computes the minimum-entropy
location distribution of this sensor. They defined the
Bayesian bound as the lower bound on the covariance of
such a distribution, which was compared with the CRLB
through simulations. In particular, under the Gaussian
model, the Bayesian bound equals the CRLB.

Formulations  of  the  CRLB  for  localization  problems 
were  also  presented  in  [21,22]:  Patwari  et  al.  derived  the 
CRLB  under  Gaussian  and  log-normal  models  respectively 
in  [21];  Larsson  included  clock  biases  existing  in  TOA 
(time  of  arrival)  measurement  systems  as  unknown 
parameters  in  the  formulation  of  the  CRLB  in  [22]. 

Besides  these  studies  on  the  general  behavior  of 
localization  errors,  very  limited  results  were  also  derived 
for  error  propagation.  Niculescu  et  al.  in  [24]  provided  a 
closed-form  formula  describing  error  propagation  in 
terms  of  the  fraction  of  anchors,  node  density  and  the 
level  of  measurement  errors.  Their  result  is  not  valid  for 
purely  distance-based  localization  algorithms  but  for  a 
particular  localization  algorithm,  termed  DV-Position, 
which  requires  both  distance  and  AOA  (angle  of  arrival) 
measurements. 

In [25], Shi et al. studied error propagation by
performing a theoretical analysis for one-dimensional
networks. Two major conclusions were reported: the
localization accuracy improves by adding more sensors
into the network; for multi-hop localization procedures,
the localization accuracy of a sensor is mainly dependent
upon the number of hops it is away from anchors, so that
the further a sensor is from the anchors, the worse its
localization accuracy. For two- and three-dimensional
cases, these conclusions were extended by simulations
and experiments. Nevertheless, their theoretical
treatments are not adequate because they are restricted to
highly simple one-dimensional networks.

In addition, [26] formally analyzed the algorithmic
complexity of localization using interferometric ranging
and proposed an iterative algorithm that gradually
determines sensor locations hop by hop in a multi-hop
localization procedure. Given sensors randomly distributed
within a square and four anchors deployed at the center
of the square, simulations based on the iterative
algorithm indicated a linear increase of the localization
errors (measured by the average distance between true
locations and the corresponding estimated locations) with the
hop count (see footnote 1) from a sensor to anchors, but a
theoretical explanation was not provided.

In  summary,  theoretically  analyzing  localization  errors 
is  still  a  challenging  problem  in  the  literature,  and 
through  this  effort,  we  hope  to  present  a  theoretical 
understanding  on  how  localization  errors  behave  during 
the  multi-hop  localization  procedures  of  one-dimensional 
networks. 


3. Fundamental behaviors of localization errors

In this section, after formulating the CRLB in one-dimensional
networks, we shall discuss in detail several
factors affecting the CRLB as well as localization errors.
We use the following mathematical notations throughout
this paper: $E(\cdot)$ denotes the expected value; $\mathrm{Var}(\cdot)$
denotes the variance; $(\cdot)^{-1}$ denotes the matrix inverse; $x$
denotes a vector; $X_n$ denotes a square matrix of order $n$;
$(X)_{ij}$ or $(X)_{i,j}$ denotes the entry in the $i$-th row and
$j$-th column of matrix $X$; $(X)_i$ denotes the $i$-th row of
matrix $X$.

3.1. The problem model of one-dimensional networks

Define a one-dimensional network $\mathcal{N}$ to be a triple
$(A,S,M)$, where $A$ denotes the anchor set, $S$ the sensor set
and $M$ the distance measurement set. Assume that

• all nodes, including both anchors and sensors, are
deployed along a straight line;

• two nodes can measure their distance if and only if it is
less than 1;

• anchor locations are precisely known and sensor
locations are unknown;

• errors arise in distance measurements which are
independent Gaussian with mean zero and certain
variances differing from the lengths of actual
distances.

Construct an undirected graph $G = (V,E)$ for $\mathcal{N}$, where
$V = A \cup S$ and $E = M$. If $G$ is connected, $\mathcal{N}$ is connected;
otherwise, $\mathcal{N}$ is in fact a group of networks with
connected graphs, which can be studied separately. Hence, we
assume in this paper that $G$ is connected, and say $\mathcal{N}$ is
connected for simplicity.

3.2. CRLB

Given a one-dimensional network $\mathcal{N} = (A,S,M)$ conforming
to the problem model, we need to formulate
the Fisher information matrix (FIM) for $\mathcal{N}$ to compute its
CRLB. Define the following notations:

• $n = |S|$ and sensors are labeled as $1,\dots,n$;

• $m = |A|$ and anchors are labeled as $n+1, n+2, \dots, n+m$;

• the true location of $i$ ($1 \le i \le n+m$) is $x_i$;

• the true and noisy distance measurements between $i$
and $j$ are $d_{ij}$ and $\hat d_{ij}$ ($i < j$);

• $p_{ij}$ is the probability density function of $\hat d_{ij}$;

• the error in $\hat d_{ij}$ is $e_{ij} = \hat d_{ij} - d_{ij}$ and its standard
deviation is $\sigma_{ij}$ ($> 0$);

• $J_n$ is an $n \times n$ square matrix and denotes the FIM for $\mathcal{N}$;

• $\emptyset$ denotes an empty set.


Footnote 1. Throughout this paper, "hop count" between two nodes
refers to the number of hops along their shortest path.

In terms of estimation terminology, suppose
$X = \{x_1,\dots,x_n\}$ is the set of parameters to be estimated
and $M$ is the set of observations. Because of the independent
Gaussian errors, the logarithmic likelihood function is

\[ \ln f(M;X) = \sum_{\hat d_{ij} \in M} \ln p_{ij}, \tag{1} \]

\[ p_{ij} = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\!\left( -\frac{(\hat d_{ij} - |x_i - x_j|)^2}{2\sigma_{ij}^2} \right), \tag{2} \]

where $i,j = 1,\dots,n+m$. We can obtain

\[ (J_n)_{ij} = E\!\left( \frac{\partial}{\partial x_i} \ln f(M;X) \, \frac{\partial}{\partial x_j} \ln f(M;X) \right), \tag{3} \]

where $i,j = 1,\dots,n$, and then

\[ (J_n)_{ij} = \begin{cases}
\displaystyle \sum_{\hat d_{ik} \in M} \frac{1}{\sigma_{ik}^2} + \sum_{\hat d_{ki} \in M} \frac{1}{\sigma_{ki}^2}, & i = j, \\[2ex]
-\dfrac{1}{\sigma_{ij}^2}, & \hat d_{ij} \in M, \\[2ex]
-\dfrac{1}{\sigma_{ji}^2}, & \hat d_{ji} \in M, \\[2ex]
0, & \text{otherwise.}
\end{cases} \]

Obviously, $J_n$ is symmetric. Furthermore, if $J_n$ is
nonsingular, we define

\[ C_n = J_n^{-1}, \tag{4} \]

the diagonal entries of which are the CRLBs on the
variances of localization errors. Theorem 1 provides a
sufficient and necessary condition for the existence of $C_n$.
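
The piecewise expression for $(J_n)_{ij}$ can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name `fisher_information`, the dictionary layout of the measurement set and the two small example networks are assumptions made here for illustration only.

```python
from fractions import Fraction

def fisher_information(n, variances):
    """Return the n x n FIM of a network whose sensors are labeled 1..n.

    `variances` maps a measured pair (i, j) to sigma_ij^2; any label
    greater than n is treated as an anchor (hypothetical data layout).
    """
    J = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), variance in variances.items():
        weight = 1 / Fraction(variance)
        for a, b in ((i, j), (j, i)):
            if a <= n:                      # a is a sensor
                J[a - 1][a - 1] += weight   # diagonal entry: sum of 1/sigma^2
                if b <= n:                  # b is a sensor as well
                    J[a - 1][b - 1] -= weight   # off-diagonal entry: -1/sigma^2
    return J

# Hypothetical example: anchor 3 -- sensor 1 -- sensor 2, unit variances.
with_anchor = fisher_information(2, {(1, 3): 1, (1, 2): 1})
# The same two sensors with the anchor measurement removed.
without_anchor = fisher_information(2, {(1, 2): 1})
print(with_anchor)
print(without_anchor)
```

In the anchor-free example every row of the matrix sums to zero, so the matrix is singular and $C_n$ does not exist; this is the situation that Theorem 1 excludes.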


Theorem 1. Given a connected network $\mathcal{N} = (A,S,M)$, its
FIM $J_n$ is positive-definite if and only if $A \neq \emptyset$.

Proof. Define $y = [y_1,\dots,y_n]^T$ to be a column vector with
$n$ real entries. Then

\[ y^T J_n y = \sum_{\hat d_{ij} \in M,\; i,j \in S} \frac{1}{\sigma_{ij}^2} (y_i - y_j)^2 + \sum_{i=1}^{n} \sum_{\hat d_{ik} \in M,\; k \in A} \frac{1}{\sigma_{ik}^2} y_i^2. \tag{5} \]

Obviously, $J_n$ is positive-semidefinite, and thus we only
need to prove that $J_n$ is positive-definite, which is
equivalent to the statement that $J_n$ is nonsingular, or that
$y_1 = \cdots = y_n = 0$ is the unique solution to

\[ \sum_{\hat d_{ij} \in M,\; i,j \in S} \frac{1}{\sigma_{ij}^2} (y_i - y_j)^2 + \sum_{i=1}^{n} \sum_{\hat d_{ik} \in M,\; k \in A} \frac{1}{\sigma_{ik}^2} y_i^2 = 0. \tag{6} \]

Due to the non-negativity of the two terms, we obtain

\[ \sum_{\hat d_{ij} \in M,\; i,j \in S} \frac{1}{\sigma_{ij}^2} (y_i - y_j)^2 = 0, \qquad \sum_{i=1}^{n} \sum_{\hat d_{ik} \in M,\; k \in A} \frac{1}{\sigma_{ik}^2} y_i^2 = 0. \tag{7a, b} \]

Each sensor is associated with an entry of $y$. From (7a),
we know if two sensors are directly connected, the
associated entries of $y$ are equal; given two sensors, if a
path consisting of only sensors exists between them, the
associated entries of $y$ are also equal; from (7b), we know
if a sensor is directly connected to an anchor, its
associated entry of $y$ must be zero. Now if $A = \emptyset$, because
$\mathcal{N}$ is connected, paths between any two sensors only consist
of sensors and all entries of $y$ are equal but their value is
arbitrary, and hence $J_n$ is only positive-semidefinite; on
the other hand, if $A \neq \emptyset$, because $\mathcal{N}$ is connected, each
sensor is either directly connected to an anchor or
indirectly connected to an anchor through other sensors
and thus the associated entry of $y$ must be zero, and hence
$J_n$ is positive-definite. We conclude that when $\mathcal{N}$ is
connected, $J_n$ is positive-definite if and only if $A \neq \emptyset$. □

3.3. Factors affecting the CRLB

In the problem model, it is evident that distance
measurement errors are the only source of localization
errors. Apart from that, network topology related factors
such as node degrees, connectivity, hop counts to anchors,
etc., also affect the magnitude of localization errors. In
this subsection, we shall discuss the effects of these
factors on the CRLB by examining some primitive network
modifications:

• adding an additional distance measurement between
two existing sensors;

• adding an additional sensor and an additional distance
measurement between itself and another sensor;

• adding an additional anchor and an additional distance
measurement between itself and a sensor;

• replacing a sensor by an additional anchor.

From the network deployment point of view, the first
two modifications correspond to increasing sensor density
by adding more sensors or enlarging the range of
distance measurements, and the last two modifications
correspond to increasing anchor density. In what follows,
Theorems 2-5 will describe their effects.



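Theorem 2. Suppose $\mathcal{N} = (A,S,M)$ ($A \neq \emptyset$) is a connected
network and its FIM is $J_n$. If there is no distance measurement
between sensors $i$ and $j$ ($i < j$) in $\mathcal{N}$, let
$\tilde M = M \cup \{\hat d_{ij}\}$. Construct a new network
$\tilde{\mathcal{N}} = (A,S,\tilde M)$. Let $\tilde J_n$ be the FIM of
$\tilde{\mathcal{N}}$ and $C_n$ and $\tilde C_n$ be the inverses of $J_n$ and
$\tilde J_n$. Then, (1) for any sensor $s$ ($1 \le s \le n$),
$(\tilde C_n)_{ss} \le (C_n)_{ss}$; (2) $\mathrm{Tr}(\tilde C_n) < \mathrm{Tr}(C_n)$.

Proof. Suppose $y$ is a column vector of order $n$ with all
entries 0 except for the $i$-th one being 1 and the $j$-th one
being $-1$. Then, we have

\[ \tilde J_n = J_n + \frac{1}{\sigma_{ij}^2}\, y y^T. \tag{8} \]

Using the method of inverting perturbed matrices, see
e.g. [27], we obtain

\[ \tilde C_n = C_n - \frac{\left((C_n)_i - (C_n)_j\right)^T \left((C_n)_i - (C_n)_j\right)}{\sigma_{ij}^2 - 2(C_n)_{ij} + (C_n)_{ii} + (C_n)_{jj}}, \tag{9} \]

\[ (\tilde C_n)_{ss} = (C_n)_{ss} - \frac{\left((C_n)_{is} - (C_n)_{js}\right)^2}{\sigma_{ij}^2 - 2(C_n)_{ij} + (C_n)_{ii} + (C_n)_{jj}}. \tag{10} \]

Since $C_n$ is positive-definite, $|(C_n)_{ij}| < \sqrt{(C_n)_{ii}(C_n)_{jj}}$.
This together with $(C_n)_{ii} + (C_n)_{jj} \ge 2\sqrt{(C_n)_{ii}(C_n)_{jj}}$
implies that $-2(C_n)_{ij} + (C_n)_{ii} + (C_n)_{jj} > 0$ and thus
$(\tilde C_n)_{ss} \le (C_n)_{ss}$.

Moreover, define $\lambda_k(J_n)$ ($k = 1,\dots,n$) to be the eigenvalues
of $J_n$, listed in decreasing order. Because $\lambda_1(yy^T) = 2$
and $\lambda_2(yy^T) = \cdots = \lambda_n(yy^T) = 0$, $yy^T$ is
positive-semidefinite. Because $J_n$ is positive-definite, it is
known [28] that

\[ 0 < \lambda_k(J_n) \le \lambda_k(\tilde J_n). \tag{11} \]

Due to $\mathrm{Tr}(yy^T) > 0$, $\mathrm{Tr}(\tilde J_n) > \mathrm{Tr}(J_n)$ and hence
$\sum_{k=1}^{n} \lambda_k(\tilde J_n) > \sum_{k=1}^{n} \lambda_k(J_n)$; together
with (11), it can be concluded that there exists $l \in \{1,\dots,n\}$
such that $\lambda_l(\tilde J_n) > \lambda_l(J_n)$; accordingly,
there exists $l' \in \{1,\dots,n\}$ such that
$\lambda_{l'}(\tilde J_n^{-1}) < \lambda_{l'}(J_n^{-1})$, which
implies that $\mathrm{Tr}(\tilde C_n) < \mathrm{Tr}(C_n)$. □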

Remark 1. Theorem 2 indicates that adding an additional
distance measurement between two existing sensors
either reduces the CRLB for any sensor or keeps it
unchanged, while the total CRLB in the network is
definitely reduced. Specifically, in light of (10), the CRLB
for sensor $s$ is unchanged if and only if $(C_n)_{is} = (C_n)_{js}$,
namely that the correlation between location estimates
of sensors $i,s$ equals that associated with sensors $j,s$. As
shown in Fig. 2(b), it is obvious that the CRLBs for sensors
2,4,5 do not change by adding the distance measurement
between sensors 1 and 3. From the network topology
point of view, a larger node degree, i.e. more distance
measurements or a higher sensor density, corresponds to
smaller CRLBs.
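
The rank-one update behind (9) and (10) can be checked numerically. The sketch below is illustrative only: the helper name `add_measurement` and the example chain (one anchor followed by sensors 1, 2, 3, all with unit variances) are assumptions, and the matrix `C` is the inverse of the FIM of that hypothetical chain.

```python
from fractions import Fraction

def add_measurement(C, i, j, variance):
    """Apply (9): CRLB matrix after adding a measurement between sensors i and j (1-based)."""
    n = len(C)
    # Row difference (C_n)_i - (C_n)_j.
    diff = [Fraction(C[i - 1][s]) - Fraction(C[j - 1][s]) for s in range(n)]
    # Denominator sigma_ij^2 - 2 (C_n)_ij + (C_n)_ii + (C_n)_jj.
    denom = (Fraction(variance) - 2 * C[i - 1][j - 1]
             + C[i - 1][i - 1] + C[j - 1][j - 1])
    return [[C[r][c] - diff[r] * diff[c] / denom for c in range(n)]
            for r in range(n)]

# Hypothetical chain: anchor -- 1 -- 2 -- 3 with unit variances.
# Its FIM is [[2,-1,0],[-1,2,-1],[0,-1,1]], whose inverse is:
C = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]
updated = add_measurement(C, i=1, j=3, variance=1)
crlb_before = [C[s][s] for s in range(3)]
crlb_after = [updated[s][s] for s in range(3)]
print(crlb_before, crlb_after)
```

In this example the CRLB of sensor 1 is unchanged because $(C_n)_{11} = (C_n)_{31}$, while the CRLBs of sensors 2 and 3 both drop, so the trace strictly decreases.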

Remark 2. As a rule of thumb, for sensors $i$ and $j$, the
smaller the difference between the hop counts from
them to sensor $s$, the smaller usually is the difference of
correlations, i.e. $|(C_n)_{is} - (C_n)_{js}|$, and thus the smaller is
the reduction in the CRLB for sensor $s$ given by (10).
Therefore, a distance measurement between two sensors
with a large hop count results in a greater reduction in
localization errors than one between two sensors with
a small hop count.

Next,  we  consider  the  introduction  of  an  additional 
sensor. 

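Theorem 3. Suppose $\mathcal{N} = (A,S,M)$ ($A \neq \emptyset$) is a connected
network with the FIM $J_n$ and an additional sensor is labeled
as $n+m+1$. Let $\tilde S = S \cup \{n+m+1\}$ and
$\tilde M = M \cup \{\hat d_{i,n+m+1}\}$ ($1 \le i \le n$). Construct a new
network $\tilde{\mathcal{N}} = (A,\tilde S,\tilde M)$. Let $J_{n+1}$ be the FIM of
$\tilde{\mathcal{N}}$ and $C_n$ and $C_{n+1}$ be the inverses of $J_n$ and
$J_{n+1}$. Then, (1) the leading $n \times n$ submatrix of $C_{n+1}$ is
identical to $C_n$; (2) $(C_{n+1})_{n+1,n+1} = (C_n)_{ii} + \sigma_{i,n+m+1}^2$.

Proof. Suppose $y$ is a column vector of order $n$ with all
entries 0 except for the $i$-th one being 1, and according to
the calculations of the FIM, we can obtain

\[ J_{n+1} = \begin{pmatrix} J_n + \dfrac{1}{\sigma_{i,n+m+1}^2}\, y y^T & -\dfrac{1}{\sigma_{i,n+m+1}^2}\, y \\[2ex] -\dfrac{1}{\sigma_{i,n+m+1}^2}\, y^T & \dfrac{1}{\sigma_{i,n+m+1}^2} \end{pmatrix}. \tag{12} \]

By using block matrix inversion, we obtain

\[ J_{n+1}^{-1} = \begin{pmatrix} C_n & \cdots \\ \cdots & y^T C_n y + \sigma_{i,n+m+1}^2 \end{pmatrix}, \tag{13} \]

\[ C_{n+1} = \begin{pmatrix} C_n & \cdots \\ \cdots & (C_n)_{ii} + \sigma_{i,n+m+1}^2 \end{pmatrix}, \tag{14} \]

which proves the theorem. □
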


Remark  3.  Theorem  3  shows  that  an  additional  sensor 
and  an  associated  distance  measurement  do  not  change 
the  CRLBs  in  the  original  network,  and  the  CRLB  for  the 
additional  sensor  equals  the  sum  of  the  variance  in  the 
distance  measurement  and  the  CRLB  for  the  sensor  that 
the  distance  measurement  is  from. 
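
As a small worked check of Theorem 3, consider a hypothetical network (chosen here for illustration, with all variances equal to 1) in which sensor 1 measures its distance to a single anchor, and a new sensor 2 is then attached to sensor 1:

```latex
\[
J_1 = (1), \qquad C_1 = (1), \qquad
J_2 = \begin{pmatrix} 1 + 1 & -1 \\ -1 & 1 \end{pmatrix}
    = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
C_2 = J_2^{-1} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.
\]
```

The leading entry of $C_2$ equals $C_1$, and $(C_2)_{22} = 2 = (C_1)_{11} + 1$, in agreement with both parts of the theorem.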


Then, we consider an additional distance measurement
from an additional anchor.


Theorem 4. Suppose $\mathcal{N} = (A,S,M)$ ($A \neq \emptyset$) is a connected
network with the FIM $J_n$ and an additional anchor is labeled
as $n+m+1$. Let $\tilde A = A \cup \{n+m+1\}$ and
$\tilde M = M \cup \{\hat d_{i,n+m+1}\}$ ($1 \le i \le n$). Construct a new
network $\tilde{\mathcal{N}} = (\tilde A,S,\tilde M)$. Let $\tilde J_n$ be
the FIM of $\tilde{\mathcal{N}}$ and $C_n$ and $\tilde C_n$ be the inverses of
$J_n$ and $\tilde J_n$. Then, (1) for every sensor $s$ ($1 \le s \le n$),
$(\tilde C_n)_{ss} \le (C_n)_{ss}$; (2) $\mathrm{Tr}(\tilde C_n) < \mathrm{Tr}(C_n)$.

Proof. Let $y$ be the same vector as in the proof of
Theorem 3. Based on the formulation of the FIM, we have

\[ \tilde J_n = J_n + \frac{1}{\sigma_{i,n+m+1}^2}\, y y^T. \tag{15} \]

Using the Sherman-Morrison-Woodbury formula, see e.g.
[28], we obtain

\[ \tilde C_n = C_n - \frac{(C_n y)(C_n y)^T}{\sigma_{i,n+m+1}^2 + y^T C_n y}, \tag{16} \]

\[ (\tilde C_n)_{ss} = (C_n)_{ss} - \frac{(C_n)_{si}^2}{\sigma_{i,n+m+1}^2 + (C_n)_{ii}}. \tag{17} \]

Due to $(C_n)_{ii} > 0$, we derive $(\tilde C_n)_{ss} \le (C_n)_{ss}$. By (15) and
following the same argument as in the proof of Theorem 2,
we can show that $\mathrm{Tr}(\tilde C_n) < \mathrm{Tr}(C_n)$. □

Fig. 2. A network and its variations based on different primitive
modifications. Given the original network in (a), an additional distance
measurement is added between sensors 1 and 3 in (b), an additional
anchor 6 and an associated distance measurement are added in (c), and
the sensor 5 is replaced by an anchor in (d).


Remark 4. Theorem 4 indicates that introducing an
additional distance measurement from an additional
anchor either reduces the CRLB for any sensor or keeps it
unchanged, but reduces the total CRLB in the network. In
view of (17), such a modification does not change the CRLB
for sensor $s$ if and only if $(C_n)_{si} = 0$, namely that there is no
correlation between location estimates of sensors $s,i$.
Fig. 2(c) illustrates an example, in which the CRLBs for
sensors 1-3 are unchanged even if an additional anchor 6
and an associated distance measurement are introduced.
Evidently, these conclusions remain correct if the
additional distance measurement is made from an existing
anchor rather than from an additional anchor.

Remark 5. Provided that $s = i$ in (17), we can obtain
$(\tilde C_n)_{ii} = \sigma_{i,n+m+1}^2 (C_n)_{ii} / (\sigma_{i,n+m+1}^2 + (C_n)_{ii}) < \sigma_{i,n+m+1}^2$,
which reveals that no matter how large $(C_n)_{ii}$ is, an additional
distance measurement between sensor $i$ and an anchor
always makes $(\tilde C_n)_{ii}$ less than $\sigma_{i,n+m+1}^2$. Interpreted
from another point of view, the positions of anchors have great
influence on the CRLBs in the sense that a small hop count of
a sensor away from anchors probably leads to a small CRLB
and thus a small error.
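
The update in (16) and (17) and the bound in Remark 5 can be illustrated as follows. This is a minimal sketch under stated assumptions: the helper name `add_anchor_measurement` and the example chain (one anchor followed by sensors 1, 2, 3 with unit variances, to which a second anchor is attached at sensor 3) are hypothetical.

```python
from fractions import Fraction

def add_anchor_measurement(C, i, variance):
    """Apply (16): CRLB matrix after linking sensor i (1-based) to a new anchor."""
    n = len(C)
    column = [Fraction(C[r][i - 1]) for r in range(n)]   # C_n y, the i-th column of C_n
    denom = Fraction(variance) + column[i - 1]            # sigma^2 + y^T C_n y
    return [[C[r][c] - column[r] * column[c] / denom for c in range(n)]
            for r in range(n)]

# Hypothetical chain: anchor -- 1 -- 2 -- 3 with unit variances (inverse FIM below).
C = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]
variance = 1
updated = add_anchor_measurement(C, i=3, variance=variance)
crlb_after = [updated[s][s] for s in range(3)]
print(crlb_after)
```

Here sensor 3 had the largest CRLB before the modification, and after it is linked to the new anchor its CRLB falls below the measurement variance, as Remark 5 states.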


Remark 6. Based on (17), we define

\[ r_{si} = \frac{(C_n)_{ss} - (\tilde C_n)_{ss}}{(C_n)_{ss}} = \frac{\rho_{si}^2}{1 + \sigma_{i,n+m+1}^2 / (C_n)_{ii}}, \tag{18} \]

where $i,s = 1,\dots,n$ and $\rho_{si}$ ($|\rho_{si}| \le 1$) denotes the
correlation coefficient between $i$ and $s$. Hence, $r_{si}$ measures
the relative reduction in the CRLB for sensor $s$. The coefficients
$\rho_{si}$ depend on network topologies and may require a
lengthy calculation for their determination. Simulations,
however, reveal that $\rho_{si}$ decreases at a nearly linear rate as
the hop count between $i$ and $s$ increases. Thus, according to
(18), the effect of an additional anchor dies off at a
quadratic speed in a direction moving away from this
anchor.

Remark 7. Given a network and an additional anchor, the
optimal placement of this anchor will maximize
$\sum_{s=1}^{n} ((C_n)_{ss} - (\tilde C_n)_{ss})$, which, see (17), is equivalent to
determining the value of $i$. The coefficient $|\rho_{si}|$ reaches
the maximum 1 when $s = i$. Assume that $\rho_{si}$ increases
monotonically with $s$ increasing from 1 to $i$ and decreases
monotonically from $i$ to $n$. By observing (18), given a
relatively small $(C_n)_{ii}$, $1/(1 + \sigma_{i,n+m+1}^2/(C_n)_{ii})$ is relatively
small, and with $s$ around $i$, $(C_n)_{ss}$ is also small but $\rho_{si}^2$ is
comparatively large, which probably leads to a small sum;
on the other hand, for a relatively large $(C_n)_{ii}$,
$1/(1 + \sigma_{i,n+m+1}^2/(C_n)_{ii})$ is relatively large, and with $s$ around
$i$, both $(C_n)_{ss}$ and $\rho_{si}^2$ are comparatively large, which
probably leads to a large sum. Thus, an additional anchor
should be put near the sensor with the largest CRLB.
Because sensors near anchors always have small CRLBs,
additional anchors should not be placed near these
sensors and hence anchors should not be clustered.
Essentially, adding an additional anchor will probably
result in more than one additional distance measurement.
But if we take into account these distance measurements
one by one, based on this remark, effects of other distance
measurements may be trivial compared with that of the
first one. Actually, it does not matter which one should be
the first.

Lastly, we consider replacing an existing sensor by an
additional anchor.

Theorem 5. Suppose $\mathcal{N} = (A,S,M)$ ($A \neq \emptyset$) is a connected
network with the FIM $J_n$ and $n \in S$. Let $\tilde A = A \cup \{n\}$ and
$\tilde S = S - \{n\}$. Construct a new network
$\tilde{\mathcal{N}} = (\tilde A,\tilde S,M)$. Let $J_{n-1}$ be the FIM of
$\tilde{\mathcal{N}}$ and $C_n$ and $C_{n-1}$ be the inverses of $J_n$ and
$J_{n-1}$. Then, (1) for every sensor $s$ ($1 \le s \le n-1$),
$(C_{n-1})_{ss} \le (C_n)_{ss}$; (2) $\mathrm{Tr}(C_{n-1}) < \sum_{k=1}^{n-1} (C_n)_{kk}$.

Proof. Let $y$ be a column vector consisting of the leading
$n-1$ entries in $(J_n)_n$. Based on the formulation of the FIM,
we have

\[ J_n = \begin{pmatrix} J_{n-1} & y \\ y^T & (J_n)_{nn} \end{pmatrix}, \tag{19} \]

\[ C_n = \begin{pmatrix} \left( J_{n-1} - (J_n)_{nn}^{-1}\, y y^T \right)^{-1} & \cdots \\ \cdots & \cdots \end{pmatrix}. \tag{20} \]

Suppose the leading $(n-1) \times (n-1)$ submatrix of $C_n$ is
$V_{n-1}^{-1}$, which is positive-definite since $C_n$ is
positive-definite. Then, we can obtain

\[ J_{n-1} = V_{n-1} + (J_n)_{nn}^{-1}\, y y^T, \tag{21} \]

\[ C_{n-1} = V_{n-1}^{-1} - \frac{(J_n)_{nn}^{-1}\, V_{n-1}^{-1} y y^T V_{n-1}^{-1}}{1 + (J_n)_{nn}^{-1}\, y^T V_{n-1}^{-1} y}, \tag{22} \]

\[ (C_{n-1})_{ss} = (V_{n-1}^{-1})_{ss} - \frac{(J_n)_{nn}^{-1} \left( (V_{n-1}^{-1})_s\, y \right)^2}{1 + (J_n)_{nn}^{-1}\, y^T V_{n-1}^{-1} y}. \tag{23} \]

Rewrite $y^T V_{n-1}^{-1} y$ to

\[ y^T V_{n-1}^{-1} y = \begin{pmatrix} y^T & 0 \end{pmatrix} C_n \begin{pmatrix} y \\ 0 \end{pmatrix}. \tag{24} \]

Obviously, $y^T V_{n-1}^{-1} y \ge 0$. This with $(J_n)_{nn} > 0$ implies
$(C_{n-1})_{ss} \le (C_n)_{ss}$. By (21) and following the same argument
as in Theorem 2, we can prove the second part of this
theorem. □

Remark 8. This theorem reveals that replacing a sensor
in a network by an anchor and keeping all distance
measurements unchanged either reduces the CRLB for
any sensor or keeps it unchanged, but the total CRLB in
the network (excluding the sensor being replaced) is
definitely reduced. According to (23), the sufficient and
necessary condition for the unchanged CRLB is that
$(V_{n-1}^{-1})_s\, y$ equals 0. Fig. 2(d) shows an example in which
the CRLBs for sensors 1,2,3 are unchanged after sensor 5 is
replaced by an anchor.
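
A small worked example may help to follow (19)-(23). The network below is hypothetical and chosen only for illustration: a chain consisting of one anchor followed by sensors 1, 2, 3, with all variances equal to 1, in which sensor 3 is replaced by an anchor.

```latex
\[
J_3 = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}, \quad
C_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{pmatrix}, \quad
y = \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \quad (J_3)_{33} = 1, \quad
V_2^{-1} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.
\]
\[
y^T V_2^{-1} y = 2, \qquad
(C_2)_{11} = 1 - \frac{(-1)^2}{1 + 2} = \frac{2}{3}, \qquad
(C_2)_{22} = 2 - \frac{(-2)^2}{1 + 2} = \frac{2}{3}.
\]
```

The same values follow from inverting $J_2 = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$ directly, and $\mathrm{Tr}(C_2) = 4/3 < (C_3)_{11} + (C_3)_{22} = 3$, consistent with both parts of Theorem 5.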

4.  A  divide  and  conquer  analysis  in  large-scale 
sensor  networks 

A common view is that the localization error of a
sensor is mainly determined by nearby network elements,
while distant sensors and anchors have little effect.
In this section, we shall theoretically analyze this point
and then propose a divide and conquer method to analyze
localization errors in large-scale networks.

4.1.  Simulation  setup 

In this and subsequent sections, we shall carry out
extensive simulations in Matlab to verify our conclusions.
One common step involved in all simulations is generating
one-dimensional networks: a network is produced by
randomly and uniformly distributing $n$ sensors in the
range of $[0, L]$. A network is called a valid network only
if it is connected and the locations 0 and $L$ are within the
distance measurement scopes of the leftmost and the
rightmost sensors respectively. If a network is invalid,
we regenerate it until a valid one is obtained. For simplicity,
we assume that the standard deviations of distance
measurement errors are identically 1, i.e. $\sigma_{ij} = 1$. The values
of $n$ and $L$ will be assigned differently for different
simulations. Anchors are not taken into consideration at this
stage, since various simulations demand different anchor
placement strategies.
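
The generation step described above can be sketched as a rejection-sampling loop. The paper's simulations were run in Matlab; the Python sketch below is only an illustration, and the function name `generate_valid_network`, its parameters and the retry limit are assumptions. It relies on the fact that, with a measurement range of 1 (Section 3.1), sensors on a line form a connected network exactly when every gap between consecutive sensors is less than 1.

```python
import random

def generate_valid_network(n, L, sensing_range=1.0, rng=random, max_tries=100000):
    """Draw n sensor positions uniformly in [0, L] until the network is valid."""
    for _ in range(max_tries):
        positions = sorted(rng.uniform(0.0, L) for _ in range(n))
        # Connected: every pair of neighboring sensors is within range.
        connected = all(b - a < sensing_range
                        for a, b in zip(positions, positions[1:]))
        # Locations 0 and L lie within range of the outermost sensors.
        ends_covered = (positions[0] < sensing_range
                        and L - positions[-1] < sensing_range)
        if connected and ends_covered:
            return positions
    raise RuntimeError("no valid network found; try more sensors or a smaller L")

# Hypothetical parameters: 10 sensors on [0, 4], seeded for repeatability.
positions = generate_valid_network(10, 4.0, rng=random.Random(0))
print(positions)
```

Passing a seeded `random.Random` instance keeps a run repeatable; anchors are deliberately left out, as in the setup above.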

4.2.  Upper  and  lower  bound  networks 

We first define upper and lower bound networks.

Definition 2. Suppose $\mathcal{N} = (A,S,M)$ is a connected network.
Given a subset $\bar S \subseteq S$,

1. an upper bound network $\mathcal{N}^{(u)}$ is defined to be
$(A^{(u)},\bar S,M^{(u)})$, where
$A^{(u)} = \{ j \mid \exists\, \hat d_{ij} \in M, \text{ where } i \in \bar S, j \in A \}$
and $M^{(u)} = \{ \hat d_{ij} \mid \exists\, \hat d_{ij} \in M, \text{ where } i,j \in \bar S \cup A^{(u)} \}$;

2. a lower bound network $\mathcal{N}^{(l)}$ is defined to be
$(A^{(l)},\bar S,M^{(l)})$, where
$A^{(l)} = A^{(u)} \cup \{ i \mid (\exists\, \hat d_{ij} \in M) \vee (\exists\, \hat d_{ji} \in M), \text{ where } i \in (S \setminus \bar S), j \in \bar S \}$
and $M^{(l)} = \{ \hat d_{ij} \mid \exists\, \hat d_{ij} \in M, \text{ where } i,j \in \bar S \cup A^{(l)} \}$.

We define a relevant distance measurement as a
distance measurement in $M$ involving a node in $\bar S$ and a
relevant node as a node involved in a relevant distance
measurement. Then, for the upper bound network $\mathcal{N}^{(u)}$,
$A^{(u)}$ consists of all relevant anchors and $M^{(u)}$ consists of all
relevant distance measurements between any two nodes
in $\bar S$ and $A^{(u)}$; for the lower bound network $\mathcal{N}^{(l)}$, $A^{(l)}$
consists of all relevant nodes excluded from $\bar S$ and $M^{(l)}$
consists of all the relevant distance measurements. Fig. 3
shows examples of an upper and a lower bound network
associated with a sensor subset $\bar S$. Clearly, both of them
are determined by local topologies involving $\bar S$; if the
upper bound network is connected, the corresponding
lower bound network must be connected as well. The
following theorem describes the relationships between
the CRLBs for sensors in $\bar S$ in the original network and the
two bound networks.

Fig. 3. Examples of an upper and a lower bound network. Given an
original network in (a) and a sensor subset $\bar S = \{1,2,3,4,5\}$, an upper
and a lower bound network are constructed in (b) and (c), respectively.

Theorem 6. Given a connected network $\mathcal{N} = (A,S,M)$ and a
subset $\bar S \subseteq S$, construct an upper bound network
$\mathcal{N}^{(u)} = (A^{(u)},\bar S,M^{(u)})$ and a lower bound network
$\mathcal{N}^{(l)} = (A^{(l)},\bar S,M^{(l)})$. If $\mathcal{N}^{(u)}$ is connected and
$A^{(u)} \neq \emptyset$, for any sensor $s \in \bar S$, let $c_s$, $c_s^{(u)}$ and
$c_s^{(l)}$ be the CRLBs in the three networks, respectively; then
$c_s^{(l)} \le c_s \le c_s^{(u)}$.

Proof. Without loss of generality, let $\bar S = \{1,\dots,p\}$. Define
the FIMs of the three networks as $J_n$, $J_p^{(u)}$ and $J_p^{(l)}$
respectively. Because both $\mathcal{N}$ and $\mathcal{N}^{(u)}$ are connected and
$A^{(u)} \neq \emptyset$, the three FIMs are all positive-definite. Then
define $C_n$, $C_p^{(u)}$ and $C_p^{(l)}$ to be their inverses respectively
and $C_p$ to be the leading $p \times p$ submatrix of $C_n$.

In $\mathcal{N}$, replace the $q$ sensors not in $\bar S$ but directly connected
to sensors in $\bar S$ by additional anchors and the new FIM is a
block-diagonal matrix:

\[ J_{n-q} = \begin{pmatrix} J_p^{(l)} & 0 \\ 0 & \ast \end{pmatrix}, \tag{25} \]

where $\ast$ stands for the block associated with the remaining
$n-p-q$ sensors. Because such replacement neither empties the
anchor set nor disconnects the network, $J_{n-q}$ is
positive-definite:

\[ J_{n-q}^{-1} = \begin{pmatrix} C_p^{(l)} & 0 \\ 0 & \ast \end{pmatrix}. \tag{26} \]

From Theorem 5, every diagonal entry in $C_p^{(l)}$ is less than
or equal to the corresponding entry in $C_n$. Thus, $c_s^{(l)} \le c_s$.
Secondly, we can transform $\mathcal{N}^{(u)}$ into $\mathcal{N}$ by adding
sensors, anchors and distance measurements. According
to Theorems 2-4, all these operations do not increase the
CRLBs for sensors in $\bar S$; thus $c_s \le c_s^{(u)}$. □

Remark 9. Theorem 6 indicates that upper and lower bounds
on the CRLB for a sensor can be derived provided that the
topology information within a certain limited range around
this sensor is known. Consider a sensor $s$ in a network
$\mathcal{N} = (A, S, M)$ and two sensor subsets $\hat{S}$ and $\tilde{S}$
satisfying $s \in \hat{S}$ and $\hat{S} \subset \tilde{S} \subset S$. Let
$\hat{\mathcal{N}}^{(u)}$ and $\tilde{\mathcal{N}}^{(u)}$ be the upper bound
networks associated with the two subsets. Suppose that each of
them is connected and contains at least one anchor. Let $c_s$ be
the CRLB for sensor $s$ in $\mathcal{N}$, and let $\hat{c}_s^{(u)}$ and
$\tilde{c}_s^{(u)}$ be the CRLBs for sensor $s$ in $\hat{\mathcal{N}}^{(u)}$
and $\tilde{\mathcal{N}}^{(u)}$ respectively. According to Theorem 6, both
$\hat{c}_s^{(u)}$ and $\tilde{c}_s^{(u)}$ are upper bounds on $c_s$. Because
$\hat{S} \subset \tilde{S}$, $\hat{\mathcal{N}}^{(u)}$ can be transformed into
$\tilde{\mathcal{N}}^{(u)}$ by adding sensors, anchors and associated
distance measurements; according to Theorems 2-4, we obtain
$\hat{c}_s^{(u)} \ge \tilde{c}_s^{(u)} \ge c_s$, implying that the upper
bound based on a large subset is tighter than that based on a
small subset. The same conclusion holds for the lower bound case.

4.3.  Partitioning  networks 

We can employ anchors as boundaries to partition a
connected network $\mathcal{N}$ into a group of segments such that
the sensors belonging to each segment form a sensor subset.
The union of all subsets equals the sensor set of $\mathcal{N}$ and the
intersection of any two subsets is empty. Given a sensor
subset in $\mathcal{N}$, we establish an upper bound network
$\mathcal{N}^{(u)}$ and a lower bound network $\mathcal{N}^{(l)}$, which are
both unit networks; the CRLB for each sensor in the subset can be
bounded using these two bound networks. In fact, we can transform
$\mathcal{N}^{(u)}$ into $\mathcal{N}^{(l)}$ by placing (possibly zero)
additional anchors on the boundaries of $\mathcal{N}^{(u)}$, where
anchors already lie nearby, and adding distance measurements
associated with these anchors. According to Remark 7, additional
anchors near existing anchors have little effect on reducing the
CRLBs; consequently, the CRLBs in $\mathcal{N}^{(u)}$ are close to those
in $\mathcal{N}^{(l)}$, so that both bounds are tight. Because the tight
bounds for a sensor are derived from the information associated
with the sensor subset that includes this sensor, the localization
error of this sensor is mainly determined by the local part of the
whole network.

Simulations are conducted to verify the effectiveness
of this partitioning strategy. In each generated valid network,
we place $m$ equally spaced anchors in the range
$[0, L]$ with two of them at 0 and $L$ respectively. Such
a valid network is then partitioned into $m - 1$ segments by the
$m$ anchors; for each segment, we can build a pair of upper
and lower bound networks. Given $L = 100$ and $m = 10$, we
consider different values of $n$ in the simulations.
As can be seen from Fig. 4, the CRLB for every sensor in a
valid network is tightly bounded by the CRLBs in the
corresponding bound networks.

In essence, this partitioning strategy amounts to a
divide and conquer method in that the analysis of
localization errors in a large-scale network can be converted
into the analysis in small-scale unit networks. If these
unit networks have network parameters in common, we
only need to analyze one of them, instead of all of them.

4.4.  Localization  errors  in  unit  networks 

First, a class of regular unit networks is shown in
Fig. 5, where regularity means: (1) each sensor receives $h$
distance measurements from its left neighboring nodes and
another $h$ distance measurements from its right neighboring
nodes; (2) to guarantee that the sensors close to the boundaries
also receive $h$ distance measurements from each side, $h$
anchors are placed at the left side of all sensors and
another $h$ anchors at the right side.

It follows that all sensors have the identical node degree,
i.e. $2h$, in a regular unit network. In a regular unit
network with node degree 2, 4, 6 or $2n$ ($n$ equals the number
of sensors), if the standard deviations of distance
measurement errors are equal, the closed-form formulas for the
CRLBs presented in [29] exactly characterize the
behavior of localization errors. Furthermore, the following
theorem helps us understand error propagation in general
unit networks based on regular unit networks.
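
As a rough illustration of the regular unit networks described above, the following Python sketch builds the FIM of such a network and inverts it to obtain the CRLBs. It rests on assumptions made only for this sketch: every pair of nodes at most $h$ positions apart shares exactly one distance measurement, all measurement errors have the same standard deviation `sigma`, and the FIM therefore has $2h/\sigma^2$ on the diagonal and $-1/\sigma^2$ on the $h$ nearest off-diagonals. The function names are hypothetical and the closed-form formulas of [29] are not reproduced here.

```python
import numpy as np

def regular_unit_fim(n, h, sigma=1.0):
    """FIM of a regular unit network with n sensors and node degree 2h (sketch)."""
    J = np.zeros((n, n))
    for i in range(n):
        J[i, i] = 2 * h  # each sensor takes part in 2h measurements
        for j in range(max(0, i - h), min(n, i + h + 1)):
            if j != i:
                J[i, j] = -1.0  # one measurement between sensors i and j
    return J / sigma**2

def crlbs(J):
    """Per-sensor CRLBs: the diagonal of the inverse FIM."""
    return np.diag(np.linalg.inv(J))

n = 9
c1 = crlbs(regular_unit_fim(n, 1))  # node degree 2
c2 = crlbs(regular_unit_fim(n, 2))  # node degree 4
```

For $h = 1$ the FIM is the tridiagonal $(2, -1)$ Toeplitz matrix, whose inverse has diagonal entries $k(n+1-k)\sigma^2/(n+1)$ for the $k$-th sensor, so the largest CRLB falls on the middle sensor. Raising the node degree from 2 to 4 only adds measurements and anchors, so `c2` is never larger than `c1`.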


[Fig. 4 plots; horizontal axis: Sensor; panel (b): L=100, n=300, m=10.]

Fig.  4.  The  CRLBs  in  actual  networks  (blue,  middle),  upper  bound 
networks  (red,  top)  and  lower  bound  networks  (cyan,  bottom)  with 
respect  to  different  sensor  densities.  (For  interpretation  of  the  references 
to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of 
this  article.) 

Fig. 5. A regular unit network with h=2.

Theorem 7. Let $\mathcal{N} = (A, S, M)$ ($A \neq \emptyset$) be a connected
unit network. Let $d_{max}$ and $d_{min}$ be the maximum and minimum node
degrees of the graph corresponding to $\mathcal{N}$. Construct two
regular unit networks $\mathcal{N}^{(k_1)}$ and $\mathcal{N}^{(k_2)}$ with the
same number of sensors as in $\mathcal{N}$ and with unique node degrees
$k_1$ and $k_2$ ($k_1 < k_2$), respectively. Let $C_n$, $C_n^{(k_1)}$ and
$C_n^{(k_2)}$ be the inverses of the FIMs for the three networks. If
$k_1 \le d_{min} \le d_{max} \le k_2$, then for every sensor $s$
($1 \le s \le n$), $(C_n^{(k_1)})_{ss} \ge (C_n)_{ss} \ge (C_n^{(k_2)})_{ss}$.

Proof. If $k_1 \le d_{min} \le d_{max} \le k_2$, $\mathcal{N}^{(k_1)}$ can be
transformed into $\mathcal{N}$ by adding anchors and distance
measurements between sensors or between sensors and anchors.
Based on Theorems 2 and 4, these modifications do not
increase the CRLBs and thus $(C_n^{(k_1)})_{ss} \ge (C_n)_{ss}$. Similarly,
we have $(C_n)_{ss} \ge (C_n^{(k_2)})_{ss}$. □

Remark 10. Theorem 7 indicates that in a unit network,
the CRLB for each sensor can be bounded using regular
unit networks. If $d_{max} - d_{min}$ is small, so is $k_2 - k_1$; the
bounds are then tight and thus helpful. But $d_{max} - d_{min}$ is
unlikely to be small in a unit network in which
sensors are randomly and uniformly distributed. Supposing
the mean and standard deviation of node degrees in a
unit network to be $\bar{x}$ and $\delta$, one could postulate that an
approximation to performance would follow from a
regular unit network calculation in which the parameters
$d_{min}$ and $d_{max}$ are approximated by $\tilde{d}_{min} = \bar{x} - \delta$ and
$\tilde{d}_{max} = \bar{x} + \delta$ respectively.

Given $L = 100$ and different values of $n$, simulations are
conducted to validate this approximation method by
generating four valid unit networks with two anchors at
0 and $L$ respectively. As shown in Fig. 6, $\tilde{d}_{min}$ and
$\tilde{d}_{max}$ produce tighter bounds than $d_{min}$ and $d_{max}$.
Although the curves corresponding to the CRLBs in the generated
valid unit networks are not as smoothly parabolic as in regular
unit networks, they share two properties: the CRLBs increase at
a quadratic rate, and the sensors with the maximum CRLBs lie
around the middle of the unit networks.
Moreover, the upper and lower bounds based on regular
unit networks allow us to predict how fast localization
errors grow in a unit network given knowledge of the
sensor density, which determines both $\bar{x}$ and $\delta$.

Fig. 6. The actual CRLBs and their upper bounds and lower bounds in
unit networks. The actual CRLBs are plotted in solid blue curves. Upper
bounds with red dashed curves are produced using $d_{min}$, lower bounds
with cyan dashed curves using $d_{max}$, upper bounds with red solid curves
using $\tilde{d}_{min}$ and lower bounds with cyan solid curves using
$\tilde{d}_{max}$. Note that the red solid and the red dashed curves
overlap in (a). (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article.)

5. Further discussion

In this section, two practical issues are investigated.

5.1. Applicability of the analysis based on the CRLB

An analysis based on the CRLB often faces a critical question:
whether the CRLB is attainable in actual localization procedures,
which determines the applicability of the analysis. We first
consider a localization procedure in which all sensors are
localized simultaneously using the MLE. Given a one-dimensional
network $\mathcal{N} = (A, S, M)$, let $x_i$ and $x_j$ be the true locations
of nodes $i$ and $j$; if $x_i > x_j$, the distance measurement $d_{ij}$
can be expressed as follows:

$d_{ij} = x_i - x_j + e_{ij}.$  (27)

Considering all distance measurements that involve at least one
sensor, the corresponding equations like (27) can be
stacked together and written in vector form

$z = \begin{bmatrix} P & Q \end{bmatrix} \begin{bmatrix} x \\ a \end{bmatrix} + e$  (28)

where $z$ is the distance measurement vector of order $s$,
say; $a$ is the anchor location vector of order $m$; $P$ and $Q$ are
$s \times n$ and $s \times m$ matrices consisting of only 0, $-1$, 1; $e$ is
the error vector associated with $z$; $x$ is the vector containing
the $n$ parameters (i.e. sensor locations) to be estimated.
Equivalently, (28) can be expressed as

$z - Qa = Px + e.$  (29)

Then, the generalized least squares method finds the estimate
$\hat{x}$ by solving

$\hat{x} = \arg\min_{x} \left[ (z - Qa - Px)^T (Var(e))^{-1} (z - Qa - Px) \right].$  (30)

Because $e$ is Gaussian, the same estimate $\hat{x}$ can also be
derived by the MLE. Moreover, the estimate $\hat{x}$ has an
explicit formula

$\hat{x} = \left( P^T (Var(e))^{-1} P \right)^{-1} P^T (Var(e))^{-1} (z - Qa),$  (31)

and its covariance matrix is

$Var(\hat{x}) = \left( P^T (Var(e))^{-1} P \right)^{-1}.$  (32)

Regarding $P$, its row number $s$ equals the number of
distance measurements involving at least one sensor and its
column number $n$ equals the number of sensors. For the
$k$-th distance measurement, if it involves two sensors $i$
and $j$, $(P)_{ki} = 1$, $(P)_{kj} = -1$ (or $(P)_{ki} = -1$ and $(P)_{kj} = 1$,
which does not change the matrix $P^T (Var(e))^{-1} P$) and all the
other entries in this row are 0; if it involves one sensor $i$
and one anchor $j$, $(P)_{ki} = 1$ (or $(P)_{ki} = -1$, which does not
change the matrix $P^T (Var(e))^{-1} P$ either) and all the other
entries in this row are 0. Moreover, due to the assumption of
independent errors, $Var(e)$ is an $s \times s$ diagonal matrix.
Then, it is straightforward to verify that $P^T (Var(e))^{-1} P$
equals $J_n$. Thus, by (32), $Var(\hat{x}) = J_n^{-1}$ and the MLE
attains the CRLB.
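
The estimator in (31) and its covariance in (32) can be sketched in a few lines of Python. The toy network below (two anchors, two sensors, three measurements with unit error variance) and the function name `gls_localize` are hypothetical and chosen only for illustration; the measurements are noise-free so that the estimate can be checked against the true positions.

```python
import numpy as np

def gls_localize(P, Q, a, z, var_e):
    """Generalized least squares estimate (31) and its covariance (32)."""
    W = np.diag(1.0 / np.asarray(var_e, dtype=float))  # (Var(e))^{-1}, diagonal
    J = P.T @ W @ P                                    # equals the FIM J_n
    x_hat = np.linalg.solve(J, P.T @ W @ (z - Q @ a))  # eq. (31)
    cov = np.linalg.inv(J)                             # eq. (32)
    return x_hat, cov

# Hypothetical toy network: anchors at 0 and 10, sensors at 3 and 7.
a = np.array([0.0, 10.0])
x_true = np.array([3.0, 7.0])
# Rows: (sensor 1, anchor 1), (sensor 2, sensor 1), (anchor 2, sensor 2).
P = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
Q = np.array([[-1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
var_e = np.array([1.0, 1.0, 1.0])
z = P @ x_true + Q @ a  # noise-free distance measurements, as in (28)
x_hat, cov = gls_localize(P, Q, a, z, var_e)
```

The diagonal of `cov` holds the per-sensor CRLBs. Solving the linear system with `np.linalg.solve` rather than multiplying by an explicit inverse is a numerical design choice of this sketch.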

Therefore, our analysis is certainly applicable here. But
the above localization procedure essentially depends on
centralized processing, which is impractical for large-scale
networks. Decentralized processing can in fact be
considered as well.

Recalling the construction of upper bound networks in a
large-scale network introduced in Section 4.3, it is clear that
each upper bound network is actually a subnetwork (or
subset) of the original large-scale network. Assuming that
one node is elected from such a subnetwork and collects all
information within this subnetwork, including available
distance measurements and anchor positions, a locally
centralized localization procedure using the MLE can be performed
on this node. Firstly, sensors in this subnetwork are
localizable since the corresponding upper bound network is a unit
network with two anchors at its two ends. Secondly,
considering the comparatively small scale of such subnetworks,
it is convenient to fulfill the assumption about
node election and information collection, so that a
decentralized localization procedure is realized. Thirdly, according
to the result on the tightness of upper bounds derived
from upper bound networks in Section 4.3, which is well
reflected in simulations, the localization errors produced in
the decentralized localization procedure (i.e. the CRLB in
upper bound networks) are close to the errors produced
in the centralized localization procedure (i.e. the CRLB in the
original large-scale network). Hence, the analysis based on
the CRLB is also indicative for decentralized processing.


5.2.  The  optimal  placement  of  anchors 

As mentioned previously, anchors have a significant
impact on localization errors. Assuming anchor locations
are controllable, an optimal anchor placement should
minimize the total (or average) CRLB in a network, which is
intrinsically an optimization problem. Here we give a
preliminary discussion based on the preceding analysis.

According to Remark 7, an additional anchor should be
placed near the sensor with the largest CRLB in a network
to efficiently reduce the total CRLB. Suppose sensors are
randomly and uniformly distributed in a network with
anchors at both ends. In the light of the discussion in
Section 4.4, the sensor with the largest CRLB will probably
lie around the middle of the network; hence, it is
reasonable to place the additional anchor at the middle of
the network. By iteratively applying this strategy, anchors
end up equally spaced in the network. Thus, we
conjecture that, from the statistical point of view, distributing
anchors with equal spacing should be optimal in networks
with randomly and uniformly distributed sensors.
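
The intuition behind this conjecture can be checked on a deliberately simplified example. The Python sketch below assumes a chain of equally spaced nodes in which only adjacent nodes measure their distance, with a common error standard deviation; this is much simpler than the random networks used in the simulations, and the function name and node indices are hypothetical. It compares the average CRLB when the interior anchors are equally spaced against a placement that clusters them at one end.

```python
import numpy as np

def average_crlb_chain(num_nodes, anchor_idx, sigma=1.0):
    """Average CRLB over the sensors of a chain with nearest-neighbor measurements."""
    anchors = set(anchor_idx)
    sensors = [i for i in range(num_nodes) if i not in anchors]
    pos = {node: k for k, node in enumerate(sensors)}
    J = np.zeros((len(sensors), len(sensors)))
    for i in range(num_nodes - 1):  # one measurement per adjacent pair (i, i+1)
        for u, v in ((i, i + 1), (i + 1, i)):
            if u in pos:
                J[pos[u], pos[u]] += 1.0
                if v in pos:
                    J[pos[u], pos[v]] -= 1.0
    J /= sigma**2
    return float(np.mean(np.diag(np.linalg.inv(J))))

equal = average_crlb_chain(13, [0, 4, 8, 12])   # anchors equally spaced
skewed = average_crlb_chain(13, [0, 1, 2, 12])  # interior anchors clustered at one end
```

In this toy chain a run of $m$ sensors between two anchors contributes a total CRLB of $m(m+2)/6$, which is convex in $m$, so splitting the sensors into equal runs gives the smaller average. This supports the intuition only for the simplified chain and does not prove the conjecture for random networks.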

In the simulations, we generate 100 valid networks
and consider 1000 different anchor placement cases in
which, apart from the proposed case where anchors are
equally spaced, anchors are randomly and uniformly
placed. The variance of anchor intervals is defined to be
the variance of the lengths between any pair of adjacent
anchors, and reflects how far an anchor placement case
is from the equal spacing case; in particular, the
variance of anchor intervals for the equal spacing case is
0. Then, we implement each anchor placement case in the
localization procedures of the 100 valid networks, and
compute the average CRLB. The simulation
results are plotted in Fig. 7. As can be seen, the minimum
average CRLB appears to increase almost linearly with the
variance of anchor intervals, and the minimum occurs
when the variance is 0, which supports the conjecture that
the equal spacing case is optimal in a statistical sense.

Fig. 7. The average CRLB with different anchor placement strategies.

6. Conclusions

In  this  paper,  we  investigated  localization  errors  in 
one-dimensional  sensor  networks  in  terms  of  the  CRLB. 
We  theoretically  analyzed  how  different  factors  affect  the 
CRLB  and  thus  localization  errors,  and  concluded  that  the 
localization  error  for  a  sensor  is  mainly  determined  by  the 
local  elements  within  a  certain  limited  range  around  this 
sensor.  The  analysis  of  localization  errors  in  a  large-scale 
network  can  be  broken  down  into  the  analysis  in  a 
number  of  small-scale  unit  networks.  Also,  we  discussed 
two  practical  issues,  viz.  the  applicability  of  the  analysis 
based  on  the  CRLB  and  the  optimal  anchor  placement. 

The  physical  embodiment  of  a  one-dimensional  sensor 
network  does  not  necessarily  involve  an  ideal  straight  line. 
It  might  involve  a  large  circle,  a  curving  road,  an  irregular 
boundary  of  a  region  or  a  coastline  for  example,  as  long  as 
the  curvature  is  small.  Given  a  one-dimensional  sensor 
network,  the  localization  task  aims  to  determine  the 
distance  from  each  sensor  to  some  starting  point  of  this 
network.  Provided  that  this  network  is  deployed  over  a 
non-straight  line,  this  distance  is  then  interpreted  as  the 
accumulated  distance  from  this  sensor  to  the  starting  point 
along this non-straight line. In this case, if the curvature of
this line is relatively small and Euclidean distance
measurements between any pair of nodes are close to the
corresponding accumulated distances, localization results
are then consistent with the above interpretation.

Furthermore, although all these results are limited to
one-dimensional sensor networks and cannot be easily extended
to two- and three-dimensional cases, the treatments we
employed here might still be useful. Our analysis relies on
the closed-form formulas for the CRLB in one-dimensional
regular unit networks, in which FIMs are banded symmetric
Toeplitz matrices; in two-dimensional lattice networks with
anchors on the boundary and surrounding sensors, FIMs are
actually symmetric block-tridiagonal Toeplitz matrices,
which makes it possible for us to deal with localization errors
in two-dimensional sensor networks in a manner similar to that
used in the one-dimensional case.

Connectivity-based Distance Estimation in Wireless Sensor Networks

Abstract: Distance estimation is of great importance for
localization and a variety of applications in wireless sensor networks.
In this paper, we develop a simple and efficient method for
estimating distances between any pairs of neighboring nodes in
static wireless sensor networks based on their local connectivity
information, namely the numbers of their common one-hop
neighbors and non-common one-hop neighbors. The proposed
method involves two steps: estimating an intermediate parameter
through a Maximum-Likelihood Estimator (MLE) and then
mapping this estimate to the associated distance estimate. In the
first instance, we present the method by assuming that signal
transmission satisfies the ideal unit disk model, but then we
expand it to the more realistic log-normal shadowing model.
Finally, simulation results show that localization algorithms using
the distance estimates produced by this method can deliver
superior performance in most cases in comparison with the
corresponding connectivity-based localization algorithms.

I.  Introduction 

Wireless sensor networks, comprised of hundreds or
thousands of small and inexpensive nodes with constrained
computing power, limited memory and short battery lifetime,
can be used to monitor and collect data in a region of
interest. Accurate and low-cost sensor (the word “sensor”
connotes a node of unspecified location) localization is a
critical requirement for a wide variety of applications in
wireless sensor networks, and great efforts have been invested
in developing localization algorithms including both
distance-based algorithms and connectivity-based algorithms.

In reality, exact distance measurement is usually unavailable
and has to be estimated from information such as received
signal strength (RSS), time of arrival (TOA), or time difference
of arrival (TDOA) [1]. In large-scale sensor networks, it is,
however, impractical to localize all sensors by using additional
hardware such as GPS receivers and measuring devices due
to cost constraints. On the other hand, although classes of
connectivity-based localization algorithms without using any
additional measuring devices have been proposed [2], [3],
achieving a high localization accuracy usually demands a
comparatively large number of anchor nodes, hereafter termed
simply anchors, whose positions are known a priori.

In a static wireless sensor network, two nodes are termed
one-hop neighbors or simply neighbors as long as they can
communicate with each other. An intuitive observation shows
that two geographically close neighboring nodes often share
more common one-hop neighbors than two distant nodes.
In this paper, we quantify and exploit this observation to
develop a method for estimating the distance between any
pair of neighboring nodes based on their local connectivity
information, i.e. the numbers of their common one-hop
neighbors and non-common one-hop neighbors. Our method
involves two steps: first, an intermediate parameter relating the
distance and the numbers of different types of neighbors is
estimated based on a Maximum Likelihood Estimator (MLE);
second, through a mapping function we can obtain the distance
estimate from the estimate in the first step. After presenting
this method for the unit disk model, we expand it to the
more realistic log-normal shadowing model. Such distance
estimates can be directly used by distance-based localization
algorithms, and in comparison with traditional connectivity-based
localization algorithms, a significant improvement in
localization accuracy is reported through simulations.

The proposed method has the following advantages: it is
independent of additional hardware; it is totally distributed;
and it is energy efficient due to its simple mechanism and
computations. Prior to our work, [4], [5] proposed estimating
distances based on the same idea as ours, but their treatments
rest on empirical observations rather than theoretical
foundations. In comparison to their work, our paper: (1)
formalizes and gives mathematical proofs of the correctness of
the method; (2) bases the method on the MLE in both the unit
disk model and the more realistic log-normal shadowing model;
(3) reports, through simulations, the performance improvement
on localization obtained by using the distance estimates
produced by the method.

The  remainder  of  the  paper  is  organized  as  follows:  Section 
II  introduces  the  method  of  estimating  distances  in  the  unit 
disk  model  and  Section  III  expands  it  to  the  log-normal 
shadowing  model.  Section  IV  investigates  the  performance 
improvement  on  localization  by  using  the  proposed  method 
through  simulations.  Finally,  Section  V  concludes  the  paper. 

II.  Estimating  Distances  in  the  Unit  Disk  Model 

We  first  introduce  the  network  model  used  here  and  then 
present  the  method  in  the  unit  disk  model. 

A.  Network  Model 

Considering a static wireless sensor network where nodes
are randomly and uniformly distributed in a 2-dimensional
region, a homogeneous Poisson process provides an accurate
model for the distribution of nodes as the network size
approaches infinity [6]. Let $\lambda$ denote the node density (the
number of nodes per unit area); the probability mass
function of the number of nodes $N$ in a region of area $D$ is
given by

$\Pr(N = n) = \frac{(\lambda D)^n}{n!} e^{-\lambda D}$  (1)

where $\Pr(\cdot)$ denotes the probability of a statistical event.
Obviously, $N$ is a Poisson random variable with mean $\lambda D$.

Moreover, throughout this paper we assume that:

Assumption 1: Nodes are Poissonly distributed with a
uniform density $\lambda$ in an infinite plane.

Assumption 2: Each node knows the neighborhood
information (i.e. the list of one-hop neighbors) of all its one-hop
neighbors.

Assumption 1 avoids the boundary effect and Assumption
2 ensures that local connectivity information is available for
each node to carry out the distance estimation method.

B. Estimating Distances

Given a static wireless sensor network conforming to
Assumption 1, the node transmission range is identically $r$, as we
are working with the unit disk model.

As shown in Fig. 1, given two neighboring nodes $A$ and
$B$ separated by distance $d$ ($d \le r$), two circles with radius $r$
centered at $A$ and $B$ represent their individual communication
coverage; they intersect and create three disjoint regions. The
nodes residing in the middle one are common one-hop
neighbors of $A$ and $B$, and the nodes residing in the left (right) one
are non-common one-hop neighbors of $A$ ($B$). Define $S_1$ to
be the area of the middle one; the areas of the left
and the right ones are both $\pi r^2 - S_1$, denoted $S_2$. Moreover, we
let three random variables $M$, $P$, $Q$ denote the numbers of the
three kinds of neighbors and define a key parameter $\rho$:

$\rho = \frac{E(2M)}{E(2M + P + Q)}$  (2)

where $E(\cdot)$ denotes the expected value of a random variable.

Fig. 1. Communication coverage of two neighboring nodes in the unit disk
model.

As pointed out in [7], $M$, $P$, $Q$ are mutually independent
Poisson distributed random variables with the means
$\lambda S_1$, $\lambda S_2$, $\lambda S_2$. Equivalently, $\rho$ can also be expressed as

$\rho = \frac{S_1}{S_1 + S_2} = \frac{S_1}{\pi r^2}$  (3)

According to the geometries among $d$, $r$ and $S_1$, we have

$S_1 = 2 r^2 \arccos\left(\frac{d}{2r}\right) - d \sqrt{r^2 - \frac{d^2}{4}}$  (4)

Fig. 2. The true and approximate functions from d to ρ (r = 1).

In effect, $S_1$ is a monotonically decreasing function of $d$,
and so is $\rho$. Hence, as long as $\rho$ is available, we can compute
$d$ by using its inverse function, termed a mapping function.

The actual values of $M$, $P$ and $Q$ can be easily obtained in
the light of Assumption 2 and can be further employed to
estimate $\rho$ according to Theorem 1.

Theorem 1: Given three independent Poisson distributed
random variables $M$, $P$ and $Q$ which define
$\rho = \frac{E(2M)}{E(2M + P + Q)}$, the MLE for $\rho$, termed $\hat{\rho}$, is

$\hat{\rho} = \begin{cases} 1, & \text{if } M = P = Q = 0 \\ \frac{2M}{2M + P + Q}, & \text{otherwise} \end{cases}$  (6)

Proof: Establish a statistical model: measured data are
observations of $M$, $P$ and $Q$, denoted $\phi = [\, m \;\; p \;\; q \,]$ where
$m$, $p$, $q$ are non-negative integers; the unknown parameters are
$\theta = [\, \rho \;\; \lambda \,]$. The likelihood function of this model is

$\mathcal{L}(\theta, \phi) = \Pr(M = m) \times \Pr(P = p) \times \Pr(Q = q)$  (7)

The MLE is the solution to the following equation set

$\frac{\partial \ln \mathcal{L}(\theta, \phi)}{\partial \theta} = 0$  (8)

which yields

$\frac{m}{\rho} - \frac{p + q}{1 - \rho} + \lambda \pi r^2 = 0$  (9)

$\frac{m + p + q}{\lambda} - \pi r^2 (2 - \rho) = 0$  (10)

Eliminating $\lambda$ from (9) and (10), we can obtain

$2m = (2m + p + q)\rho$  (11)

If $2m + p + q > 0$, i.e. $m$, $p$ and $q$ are not simultaneously 0,
the solution for $\rho$ is $\frac{2m}{2m + p + q}$; otherwise, the solution for
$\rho$ is not well-defined. But because $\rho = 1$ maximizes the likelihood
when $2m + p + q = 0$, we obtain the MLE for $\rho$, termed $\hat{\rho}$, as

$\hat{\rho} = \begin{cases} 1, & \text{if } M = P = Q = 0 \\ \frac{2M}{2M + P + Q}, & \text{otherwise} \end{cases}$
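
The two steps of the method for the unit disk model can be sketched in Python as follows. The first function is the MLE of Theorem 1; the second applies the first-order mapping $d \approx \frac{\pi r}{2}(1 - \rho)$ that this section derives in (13). The function names are hypothetical, and the neighbor counts in the example are made up for illustration.

```python
import math

def estimate_rho(m, p, q):
    """MLE of rho from the counts of common (m) and non-common (p, q) neighbors."""
    total = 2 * m + p + q
    if total == 0:
        return 1.0  # Theorem 1: rho_hat = 1 when M = P = Q = 0
    return 2 * m / total

def estimate_distance(m, p, q, r):
    """Distance estimate from the first-order mapping d ~ (pi * r / 2) * (1 - rho)."""
    return 0.5 * math.pi * r * (1.0 - estimate_rho(m, p, q))

# Hypothetical counts: 3 common neighbors, 2 non-common neighbors on each side.
rho_hat = estimate_rho(3, 2, 2)
d_hat = estimate_distance(3, 2, 2, r=1.0)
```

Because the mapping is a linear approximation, the returned value can exceed $r$ when $\hat{\rho}$ is small; a caller might choose to clip it to $r$, which is a design choice outside the method as stated here.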

By substituting (4) into (3) and applying the first-order
Taylor series expansion at $d = 0$, we can obtain

$\rho \approx 1 - \frac{2d}{\pi r}$  (12)

As depicted in Fig. 2, (12) is a good approximation
to the true function from $d$ to $\rho$ when $0 \le d \le r$. As such,
the mapping function is approximately

$d \approx \frac{\pi r}{2}(1 - \rho)$  (13)

which enables us to obtain the estimate of $d$, i.e. $\hat{d}$, from $\hat{\rho}$.

III. Estimating Distances in the Log-normal
Shadowing Model

Before expanding the method to the more realistic
log-normal shadowing model, we make the following assumptions
(as is commonly the case in the literature):

Assumption 3: The attenuations of the transmit powers
between any pairs of nodes are independent and identically
distributed (i.i.d.)¹;

Assumption 4: Communication links are symmetric,
namely node $v$ can directly receive packets from node $u$
as long as node $u$ can directly receive packets from node $v$;

Assumption 5: All nodes transmit at a fixed power level.

Although these assumptions may not fully reflect a real
network environment, they still enable us to obtain some
results as estimates for more realistic situations.

¹ Even though field measurements in real applications seem to indicate that
the attenuations between two links with a common node are correlated [8], this
i.i.d. assumption is generally considered appropriate for far field transmission
and is widely used in the literature [8]-[10].

A. Log-normal Shadowing Model

The log-normal shadowing model predicts the received
signal power at a receiver at distance $d$ from a transmitter,
denoted $P_R(d)$, to be log-normally distributed around the
ensemble average received power, denoted $\bar{P}_R(d)$; this model
is based on a wide variety of measurement results [11] as well
as analytical evidence [12], and is typically written as [1]

$P_R(d)\,(\text{dBm}) = \bar{P}_R(d_0)\,(\text{dBm}) - 10 \alpha \log_{10} \frac{d}{d_0} + Z$  (14)

where $Z$ is a random variable representing the shadowing
effect, normally distributed with mean zero and variance $\sigma^2$;
$\bar{P}_R(d_0)$ (dBm) is the ensemble average received signal power
in dBm at a short reference distance $d_0$; $\alpha$ is the path-loss
exponent. Both $\alpha$ and $\sigma$ are a priori known constants; typically,
$\sigma$ is as low as 4 and as high as 12, and $\alpha$ varies from 2 in
free space to 6 in heavily built urban areas [11].

Given a transmitter $A$ and a receiver $B$, if $P_R(d)$ is no less
than some specified value $P_c$, a directional communication link
exists from $A$ to $B$; equivalently, a directional communication
link also exists from $B$ to $A$ due to Assumption 4. In particular,
if the shadowing effect $Z$ vanishes, i.e. $\sigma = 0$, the log-normal
shadowing model is equivalent to the unit disk model with the
communication range of

$r = d_0 \left( \frac{\bar{P}_R(d_0)}{P_c} \right)^{1/\alpha}$  (15)

which is a constant given $\alpha$. Otherwise, the probability that
two nodes at distance $d$ can communicate is

$g(d) = \int_{k_1 \ln(d/r)}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{z^2}{2\sigma^2}}\,dz$  (16)

where $k_1 = \frac{10\alpha}{\ln 10}$.

B.  Distributions  of  M,  P,  Q 

We  still  use  M,  P,  Q  to  denote  the  numbers  of  common  and 
non-common  one-hop  neighbors  associated  with  A  and  B.  The 
following  theorem  and  corollary  provide  their  distributions  in 
the  log-normal  shadowing  model. 

Theorem 2: Suppose a sensor network where nodes are randomly and
uniformly distributed with density $\lambda$ in a disk of radius $R$;
given two nodes A and B, let $M$ be the number of their common one-hop
neighbors and $P$ and $Q$ be the numbers of their non-common one-hop
neighbors. $M$, $P$ and $Q$ are Poisson random variables in the limiting
case of $R \to \infty$.

Proof: Let $(x_1, y_1)$ and $(x_2, y_2)$ be the positions of A and B.
Given an arbitrary node C, there exist four cases with regard to
communications between A, B and C:

1) C can directly communicate with both A and B;

2) C can directly communicate with A but not B;

3) C can directly communicate with B but not A;

4) C cannot directly communicate with either A or B.

Clearly, $M$ is the number of nodes satisfying Case 1, and $P$ (or $Q$)
is the number of nodes satisfying Case 2 (or 3). Supposing the disk area
is $D$, the probabilities that C satisfies the $i$-th ($i = 1, 2, 3, 4$)
case, termed $p_i$, are

\[ p_1 = \frac{1}{D} \iint_D g(d_1)\,g(d_2)\,dx\,dy \tag{17} \]

\[ p_2 = \frac{1}{D} \iint_D g(d_1)\,(1 - g(d_2))\,dx\,dy \tag{18} \]

\[ p_3 = \frac{1}{D} \iint_D (1 - g(d_1))\,g(d_2)\,dx\,dy \tag{19} \]

\[ p_4 = \frac{1}{D} \iint_D (1 - g(d_1))\,(1 - g(d_2))\,dx\,dy \tag{20} \]

where $d_1 = \sqrt{(x - x_1)^2 + (y - y_1)^2}$ and
$d_2 = \sqrt{(x - x_2)^2 + (y - y_2)^2}$.

To obtain the number of one-hop neighbors of A, i.e. $M + P$, we conduct
a test for each node in the network except for A and B to decide whether
it satisfies Case 1 or 2; the total number of successful tests is then
$M + P$. Due to Assumption 3, all of the tests are independent of each
other and hence the test process is a Bernoulli process. Then $M + P$
follows a Binomial distribution with the total number of tests
$n = \lambda D - 2$ and the success probability $p = p_1 + p_2$.
Moreover, if $\lim_{n \to \infty} np$ converges, the distribution of
$M + P$ is Poisson with expected value $\lim_{n \to \infty} np$.
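Because

\[ \lim_{n \to \infty} np = \lim_{D \to \infty} \Big[ \lambda \iint_D g(d_1)\,dx\,dy \Big] \tag{21} \]

we let $(x_1, y_1)$ be the origin, write $x$ for the radial distance
$d_1$, and transform (21) into the polar coordinate system:

\[ \lim_{n \to \infty} np = \lim_{R \to \infty} \Big[ \lambda \int_0^{2\pi}\!\!\int_0^{R} g(x)\,x\,dx\,d\theta \Big]
 = 2\pi\lambda \int_0^{\infty} x \int_{k_1 \ln\frac{x}{r}}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{z^2}{2\sigma^2}}\,dz\,dx \tag{22} \]

When $x$ is no less than some value, say $a$, $k_1 \ln\frac{x}{r} > 1$ and

\[ \lim_{n \to \infty} np
 = 2\pi\lambda \Big[ \int_0^{a} x \int_{k_1 \ln\frac{x}{r}}^{\infty} \frac{e^{-\frac{z^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}\,dz\,dx
 + \int_a^{\infty} x \int_{k_1 \ln\frac{x}{r}}^{\infty} \frac{e^{-\frac{z^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}\,dz\,dx \Big] \]
\[ \qquad < 2\pi\lambda \Big[ \int_0^{a} x \int_{k_1 \ln\frac{x}{r}}^{\infty} \frac{e^{-\frac{z^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}\,dz\,dx
 + \int_a^{\infty} x \int_{k_1 \ln\frac{x}{r}}^{\infty} \frac{z\,e^{-\frac{z^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}\,dz\,dx \Big] \tag{23} \]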

The two integrals on the right-hand side of the inequality both
converge, and hence $\lim_{n \to \infty} np < \infty$. Therefore,
$M + P$ follows a Poisson distribution. We can obtain the same result
for $M + Q$. Regarding $M$, $P$ and $Q$, their success probabilities
$p_1$, $p_2$ and $p_3$ in the corresponding Bernoulli processes are no
greater than those of $M + P$ or $M + Q$, so
$\lim_{n \to \infty} np < \infty$ when $p = p_1, p_2, p_3$, and it
follows that $M$, $P$ and $Q$ follow Poisson distributions. ■
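The Binomial-to-Poisson step used in the proof can be checked numerically. The following sketch (our own illustration, not part of the original derivation) holds $np$ fixed and measures the largest pointwise gap between the two probability mass functions as $n$ grows; the function names and the cut-off `k_max` are assumptions made for the example.

```python
import math

def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, mean):
    """Pr(X = k) for X ~ Poisson(mean)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def max_pmf_gap(n, mean, k_max=30):
    """Largest |Binomial - Poisson| pmf difference over k = 0..k_max,
    with success probability p = mean / n so that n * p stays fixed."""
    p = mean / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
               for k in range(k_max + 1))

# The gap shrinks as the number of Bernoulli tests grows.
gaps = [max_pmf_gap(n, 5.0) for n in (100, 1000, 10000)]
```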

Corollary 1: $M$, $P$ and $Q$ are mutually independent in the limiting
case of $R \to \infty$.

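Proof: Define two events $M \le m$ and $Q \le q$. Then,

\[ \Pr(M \le m \cap Q \le q) = \sum_{j=0}^{q} \Pr(M \le m \cap Q = j)
 = \sum_{j=0}^{q} \big[ \Pr(M \le m \mid Q = j)\,\Pr(Q = j) \big]
 = \sum_{j=0}^{q} \sum_{i=0}^{m} \big[ \Pr(M = i \mid Q = j)\,\Pr(Q = j) \big] \tag{24} \]

Consider the limiting case of $R \to \infty$ (equivalently
$n \to \infty$):

\[ \lim_{n \to \infty} \Pr(M \le m \cap Q \le q)
 = \lim_{n \to \infty} \sum_{j=0}^{q} \sum_{i=0}^{m} \big[ \Pr(M = i \mid Q = j)\,\Pr(Q = j) \big] \]
\[ \qquad = \lim_{n \to \infty} \sum_{j=0}^{q} \sum_{i=0}^{m} \left[ \frac{\left( \frac{(n-j)\,p_1}{1 - p_3} \right)^{i} e^{-\frac{(n-j)\,p_1}{1 - p_3}}}{i!} \times \frac{(np_3)^{j}\, e^{-np_3}}{j!} \right] \tag{25} \]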


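According to the proof of Theorem 2, when $p = p_1, p_2, p_3$,
$\lim_{n \to \infty} np < \infty$, and then $\lim_{n \to \infty} p = 0$.
Hence,

\[ \lim_{n \to \infty} \Pr(M \le m \cap Q \le q)
 = \lim_{n \to \infty} \sum_{j=0}^{q} \sum_{i=0}^{m} \left[ \frac{(np_1)^{i}\, e^{-np_1}}{i!} \times \frac{(np_3)^{j}\, e^{-np_3}}{j!} \right] \]
\[ \qquad = \lim_{n \to \infty} \left[ \sum_{i=0}^{m} \frac{(np_1)^{i}\, e^{-np_1}}{i!} \times \sum_{j=0}^{q} \frac{(np_3)^{j}\, e^{-np_3}}{j!} \right] \]
\[ \qquad = \lim_{n \to \infty} \left[ \sum_{i=0}^{m} \Pr(M = i) \times \sum_{j=0}^{q} \Pr(Q = j) \right]
 = \lim_{n \to \infty} \big[ \Pr(M \le m) \times \Pr(Q \le q) \big] \tag{26} \]

Therefore, $M$ and $Q$ are independent as $R \to \infty$. By the same
approach, we can show that $M$, $P$ and $Q$ are mutually independent. ■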
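The limiting factorization in (26) is an instance of the standard thinning property of Poisson counts: if a Poisson number of nodes is split independently into categories, the category counts are independent Poisson variables. The sketch below checks this numerically under that assumption; it is our illustration rather than the authors' argument, and the total mean, the probabilities `p1` and `p3`, and the truncation `n_max` are arbitrary illustrative choices.

```python
import math

def poisson_pmf(k, mean):
    """Pr(X = k) for X ~ Poisson(mean), computed in log space."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def joint_pmf(i, j, mean_total, p1, p3, n_max=120):
    """Pr(M = i, Q = j) when N ~ Poisson(mean_total) nodes are each assigned
    independently to Case 1 (prob. p1), Case 3 (prob. p3) or neither.
    The sum over N is truncated at n_max (an assumption of this sketch)."""
    rest = 1.0 - p1 - p3
    total = 0.0
    for n in range(i + j, n_max + 1):
        # Multinomial coefficient n! / (i! j! (n - i - j)!)
        coeff = math.factorial(n) // (
            math.factorial(i) * math.factorial(j) * math.factorial(n - i - j))
        total += (poisson_pmf(n, mean_total) * coeff
                  * p1 ** i * p3 ** j * rest ** (n - i - j))
    return total

# The joint probability should match the product of two Poisson marginals.
joint = joint_pmf(3, 2, 20.0, 0.15, 0.10)
product = poisson_pmf(3, 20.0 * 0.15) * poisson_pmf(2, 20.0 * 0.10)
```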

In the log-normal shadowing model, Theorem 2 and Corollary 1 allow us to
apply Theorem 1 to arrive at the estimate of $\rho$. According to [10],
the expected number of one-hop neighbors of a node in the log-normal
shadowing model equals $\lambda \pi r^2 e^{2\sigma^2 / k_1^2}$, which is
in fact $E(M + P)$ and $E(M + Q)$. Furthermore, based on the proof of
Theorem 2, we have

\[ E(M) = \lambda \int_0^{2\pi}\!\!\int_0^{\infty} g(x)\, g\big(\sqrt{x^2 + d^2 - 2xd\cos\theta}\big)\, x\,dx\,d\theta \tag{27} \]

It follows that

\[ \rho = \frac{\int_0^{2\pi}\!\int_0^{\infty} g(x)\, g\big(\sqrt{x^2 + d^2 - 2xd\cos\theta}\big)\, x\,dx\,d\theta}{\pi r^2 e^{2\sigma^2 / k_1^2}} \tag{28} \]

which formalizes the functional relationship between $\rho$ and $d$.
Though (28) is not in closed form, we can produce a piecewise linear
function that approximates the inverse of (28), use it as the mapping
function, and then estimate $d$ from $\rho$.
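One way to realize such a mapping function is sketched below. It tabulates $\rho(d)$ from (28) on a grid of distances with a midpoint rule and then inverts the table by linear interpolation. The parameter values, the integration cut-off `x_max`, the grid sizes and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math

R, ALPHA, SIGMA = 1.0, 4.0, 8.0            # illustrative parameter values
K1 = 10.0 * ALPHA / math.log(10.0)

def g(d):
    """Link probability at distance d, read from (16) as a Gaussian tail."""
    if d <= 0.0:
        return 1.0
    return 0.5 * math.erfc(K1 * math.log(d / R) / (SIGMA * math.sqrt(2.0)))

def rho(d, x_max=6.0, n_x=200, n_theta=48):
    """Ratio (28): midpoint-rule integral divided by the closed-form
    denominator pi * r^2 * exp(2 sigma^2 / k1^2)."""
    hx, ht = x_max / n_x, 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_x):
        x = (i + 0.5) * hx
        gx = g(x) * x
        for j in range(n_theta):
            theta = (j + 0.5) * ht
            dist2 = x * x + d * d - 2.0 * x * d * math.cos(theta)
            total += gx * g(math.sqrt(max(dist2, 0.0)))
    s = math.pi * R * R * math.exp(2.0 * SIGMA ** 2 / K1 ** 2)
    return total * hx * ht / s

def build_table(d_max=2.0, steps=20):
    """Sample (d, rho(d)) at equally spaced distances in [0, d_max]."""
    return [(k * d_max / steps, rho(k * d_max / steps))
            for k in range(steps + 1)]

def estimate_distance(rho_obs, table):
    """Piecewise-linear inverse of rho(d); clamps outside the table."""
    if rho_obs >= table[0][1]:
        return table[0][0]
    for (d0, r0), (d1, r1) in zip(table, table[1:]):
        if r1 <= rho_obs <= r0 and r0 > r1:
            return d0 + (r0 - rho_obs) * (d1 - d0) / (r0 - r1)
    return table[-1][0]

table = build_table()
d_hat = estimate_distance(rho(0.65), table)   # should land close to 0.65
```

The table only needs to be built once per channel parameter set, so the per-estimate cost is a single table lookup and interpolation.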


IV.  Simulations 

We conduct simulations in Matlab. Since distance estimates are
available, it is feasible to localize sensors by using a variety of
distance-based localization algorithms. In order to evaluate the
proposed method in a fair environment, we investigate two well-known
connectivity-based localization algorithms that are also applicable
with distance measurements, i.e. DV-hop and MDS-MAP; their
distance-based versions are termed DV-distance and MDS-MAP distance
respectively. Due to the length limitation, we refer the reader to [2],
[3] for more details about these localization algorithms.

To avoid the boundary effect, we generate sensor networks over a large
square of size 18 × 18, but only localize the nodes inside a small
square of size 6 × 6 that shares the same center as the large square.
However, the nodes outside the small square are sometimes used in
estimating distances between nodes within the small square. The four
nodes inside the small square that are closest to its four corners are
chosen as anchors.
In the simulations, the path-loss exponent $\alpha$ is known to be 4,
and without loss of generality, the other quantities in (14) describing
the log-normal shadowing model, including $\bar{P}_R(d_0)$, $d_0$ and
$P_c$, are assigned proper values such that $r$ equals 1 according to
(15). Equivalently, $d$ and the resulting position estimation errors are
normalized.
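As a small worked example of this normalization, the sketch below evaluates the transmission range with powers expressed in dBm, which is equivalent to a ratio of linear powers raised to $1/\alpha$, and solves for the reference power that yields $r = 1$. The threshold of -90 dBm and the reference distance of 0.1 are hypothetical values chosen only for the example.

```python
import math

def transmission_range(p_ref_dbm, p_c_dbm, d0, alpha):
    """Range r at which the mean received power drops to the threshold,
    with p_ref_dbm the mean received power (dBm) at reference distance d0."""
    return d0 * 10.0 ** ((p_ref_dbm - p_c_dbm) / (10.0 * alpha))

def reference_power_for_range(r, p_c_dbm, d0, alpha):
    """Mean received power (dBm) at d0 that produces a desired range r."""
    return p_c_dbm + 10.0 * alpha * math.log10(r / d0)

# Hypothetical values: threshold -90 dBm, reference distance 0.1, alpha = 4.
p_ref = reference_power_for_range(1.0, -90.0, 0.1, 4.0)
r = transmission_range(p_ref, -90.0, 0.1, 4.0)
```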

Furthermore, $\lambda$ takes values of 4, 6, 8, 10 and $\sigma$ values
of 0, 8. For each pair of $\lambda$ and $\sigma$, 100 independent runs
are carried out and each run involves three steps: generating a wireless
sensor network by a homogeneous Poisson process of density $\lambda$;
estimating distances between any pair of neighboring nodes by using the
proposed method; localizing sensors by DV-hop and MDS-MAP and the
distance-based versions (with distances coming from the second step),
and computing the average position estimation error for each
localization algorithm. Finally, the average position estimation errors
are plotted, averaged over the 100 independent runs corresponding to
each pair of $\lambda$ and $\sigma$ and each localization algorithm.
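The first step of each run, together with the counting of common and non-common neighbors that feeds the second step, can be sketched as follows. This is a minimal Python illustration for the $\sigma = 0$ (unit disk) case only, not the Matlab code used for the reported results; the helper names, the seed and the square side are assumptions.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw a Poisson(mean) count as the number of unit-rate
    exponential arrivals that fall in the interval [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def generate_network(density, side, rng):
    """Homogeneous Poisson process of the given density on a square."""
    n = poisson_sample(density * side * side, rng)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]

def neighbors(nodes, i, r=1.0):
    """Indices of nodes within range r of node i (unit disk model)."""
    xi, yi = nodes[i]
    return {j for j, (x, y) in enumerate(nodes)
            if j != i and math.hypot(x - xi, y - yi) <= r}

def count_mpq(nodes, a, b, r=1.0):
    """Numbers of common (M) and non-common (P, Q) neighbors of a and b."""
    na = neighbors(nodes, a, r) - {b}
    nb = neighbors(nodes, b, r) - {a}
    return len(na & nb), len(na - nb), len(nb - na)

rng = random.Random(1)                       # fixed seed for repeatability
nodes = generate_network(4.0, 6.0, rng)      # density 4 on a 6 x 6 square
M, P, Q = count_mpq(nodes, 0, 1)
```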

Simulation results plotted in Fig. 3 show that, except for the case of
$\sigma = 0$ where DV-distance only marginally outperforms DV-hop, both
DV-distance and MDS-MAP distance using the distance estimates from the
proposed method dramatically outperform the corresponding DV-hop and
MDS-MAP, and the reduction in the average position estimation errors is
at least 30%. Moreover, $\lambda$ plays an important role in applying
distance estimates produced by our method in the localization. In
particular, when $\lambda$ rises from 4 to 6, the average position
estimation errors of DV-distance and MDS-MAP distance are significantly
reduced; when $\lambda$ rises further, the average position estimation
errors decrease slowly or even increase.

In  summary,  our  distance  estimation  method  makes  good 
use  of  connectivity  in  wireless  sensor  networks  and  benefits 
the  sensor  network  localization. 

V.  Conclusions 

In this paper, we proposed a method of estimating distances via
connectivity in wireless sensor networks by dealing with the ideal unit
disk model and the more realistic log-normal shadowing model. Simulation
results showed that using the distance estimates produced by this method
significantly improves localization accuracies in comparison with
connectivity-based localization algorithms. Future work will focus on
analyzing its performance, advancing its practicality by relaxing some
assumptions and improving its accuracy by utilizing more information.
Fig. 3. Average position estimation errors.

ABSTRACT

Distance estimation is vital for localization and many other
applications in wireless sensor networks. In this paper, we develop a
method that employs a maximum-likelihood estimator to estimate distances
between a pair of neighboring nodes in a static wireless sensor network
using their local connectivity information, namely the numbers of their
common and non-common one-hop neighbors. We present the distance
estimation method under a generic channel model, including the unit disk
(communication) model and the more realistic log-normal (shadowing)
model as special cases. Under the log-normal model, we investigate the
impact of the log-normal model uncertainty; we numerically evaluate the
bias and standard deviation associated with our method, which show that
for long distances our method outperforms the method based on received
signal strength; and we provide a Cramér-Rao lower bound analysis for
the problem of estimating distances via connectivity and derive helpful
guidelines for implementing our method. Finally, on implementing the
proposed method on the basis of measurement data from a realistic
environment and applying it in connectivity-based sensor localization,
the advantages of the proposed method are confirmed. Copyright © 2012
John Wiley & Sons, Ltd.
1. INTRODUCTION

Wireless sensor networks, composed of hundreds or thousands of small and
inexpensive nodes with constrained computing power, limited memory, and
short battery lifetime, can be used to monitor and collect data in a
region of interest. Accurate and low-cost node localization is important
for various applications in wireless sensor networks, and thus, great
efforts have been devoted to developing localization algorithms,
including distance-based algorithms and connectivity-based algorithms
[1]. Distance-based localization algorithms rely on distance estimates
and can achieve relatively good localization accuracy; if distance
estimates are unavailable or suffer from huge errors, connectivity-based
localization algorithms are applied but generally achieve coarse-grained
localization accuracy. Besides, distance estimation is vital for sensor
network management, such as topology control [2,3] and boundary
detection [4].


In reality, distance estimation can be realized by using information
such as received signal strength (RSS), time of arrival (TOA) and time
difference of arrival (TDOA) [1]. The RSS method (using RSS
measurements) depends on low-cost hardware and only provides
coarse-grained distance estimates; by contrast, the TOA and TDOA methods
can provide distance estimates with higher accuracy at the cost of extra
hardware, but because of cost constraints, it is impractical to equip
all sensors in a large-scale sensor network with extra hardware to
obtain accurate distance estimates and thus accurate location estimates.
Further, although a number of connectivity-based localization algorithms
have been proposed, for example, see [5-8], achieving high localization
accuracy usually demands a comparatively large number of anchor nodes,
hereafter termed simply anchors, whose positions are known a priori
(accordingly, we term other nodes whose positions are not known and need
to be determined as sensors). In this paper, we shall propose an
attractive distance estimation method that does not rely on extra
hardware but provides comparatively accurate distance estimates.

Estimating distances via connectivity in wireless sensor networks

B. Huang et al.

In a static wireless sensor network, two nodes are termed one-hop or
immediate neighbors as long as they can directly communicate with each
other. An intuitive observation shows that with a higher probability,
two geographically close nodes share more common immediate neighbors
than two distant nodes. We quantify and exploit this observation to
develop a maximum-likelihood estimator (MLE) for estimating the distance
between any pair of neighboring nodes on the basis of their local
connectivity information. Herein, local connectivity information refers
to the numbers of common and non-common immediate neighbors associated
with a pair of neighboring nodes. Because only elementary computations
and local connectivity information are involved on each node, the
proposed method is energy efficient and totally distributed.

In this paper, we present the distance estimation method under a generic
channel model, including the ideal unit disk (communication) model and
the more realistic log-normal (shadowing) model as special cases (see
Section 2 for further details). Then, we take the log-normal model as an
example to demonstrate the proposed method: the impact of uncertainties
in the log-normal model is examined; the bias and standard deviation of
derived distance estimates are numerically evaluated; the proposed
method, although not comparable with such fine-grained distance
estimation techniques as TOA and TDOA, outperforms the well-known RSS
method for long distances; the influences of various factors on the
problem of estimating distances via connectivity are analyzed on the
basis of the Cramér-Rao lower bound (CRLB), and useful guidelines for
implementing the proposed method in reality are also derived; and
finally, on implementing the proposed method on the basis of measurement
data in a realistic environment and also applying it in
connectivity-based sensor localization, the advantages of the proposed
method are confirmed.

Prior to our work, [9,10] came up with methods of estimating distances
on the basis of the same idea as ours. The neighborhood intersection
distance estimation scheme (NIDES) presented in [9] heuristically
relates the distance, for example, from node A to node B, to an easily
observed ratio, that is, the number of their common immediate neighbors
to the number of immediate neighbors of A, and then performs the
distance estimation at node A using this ratio and other a priori known
information. The NIDES assumes the unit disk model and uniformly and
randomly deployed wireless sensor networks. Its enhanced version
presented in [10] adapted the ratio by taking into account the number of
immediate neighbors of node B and heuristically stated that the NIDES
could be applied in arbitrary radio models. Although it turns out that
the enhanced NIDES leads to the same solution as ours, their entire
treatment rests on empirical observations and heuristic formulations
rather than theoretical foundations. In comparison with their work, the
contributions of this paper are as follows: (i) a statistical model is
formally established for the distance estimation problem, and an MLE
solution with mathematical proofs of the correctness is provided; (ii)
the problem is considered under a generic channel model widely used in
the literature, including the more realistic log-normal model; (iii) the
performance of the proposed method is comprehensively analyzed under the
log-normal model in terms of model uncertainties, bias, standard
deviation and root mean square error (RMSE); (iv) a CRLB analysis is
carried out for the problem of estimating distances via connectivity;
and (v) it is shown that the proposed method contributes to the quality
of connectivity-based sensor localization.

The remainder of the paper is organized as follows. Section 2 introduces
the network model and the (radio) channel model. Section 3 proposes the
method under a generic channel model. Under the log-normal model,
Section 4 analyzes the performance of the proposed method; Section 5
provides a CRLB analysis for the general distance estimation problem
using connectivity; Section 6 implements the proposed method using the
measurement data from a real environment; Section 7 reports the
contributions of the proposed method to connectivity-based sensor
localization by simulations. Finally, we conclude the paper in
Section 8.

2.  SYSTEM  MODEL  DESCRIPTION 

This section briefly introduces the system model we shall use, including
the network model and the channel model. Throughout this paper, we shall
use the following mathematical notations: $\Pr\{\cdot\}$ denotes the
probability of an event, and $E(\cdot)$ denotes the statistical
expectation.

2.1.  Network  model 

In static wireless sensor networks, nodes are often assumed to be
randomly and uniformly distributed on account of the random nature of
the network deployment. A homogeneous Poisson process provides an
accurate model for a uniform distribution of nodes as the network size
approaches infinity. Therefore, we consider a static wireless sensor
network that is deployed over an infinite plane according to a
homogeneous Poisson process of intensity $\lambda$.

2.2.  Channel  model 

Let $P_T$ be the transmitted signal power by a transmitter and $P_R(d)$
be the received signal power by a receiver located at distance $d$ from
the transmitter. According to [11], the log-normal model predicts
$P_R(d)$ to be log-normally distributed and is typically modeled as
follows:

\[ P_R(d)\,(\mathrm{dBm}) = \bar{P}_R(d_0)\,(\mathrm{dBm}) - 10\alpha \log_{10}\frac{d}{d_0} + Z \tag{1} \]

\[ \bar{P}_R(d) = \frac{\tau^2 G_T G_R P_T}{(4\pi)^2 d^{\alpha}} \tag{2} \]

where $\bar{P}_R(d_0)$ (dBm) is the mean received signal power in dBm at
a reference distance $d_0$, $\alpha$ is the path loss exponent, $G_T$
and $G_R$ are the transmitter and receiver antenna gains, $\tau$ is the
wavelength of the propagating signal in meters, and $Z$ is a random
variable representing the shadowing effect, normally distributed with
mean zero and variance $\sigma_{dB}^2$. Typically, $\alpha$ can vary
between 2 in free space and 6 in heavily built urban areas, and
$\sigma_{dB}$ is as low as 4 and as high as 12 according to [11].

If $P_R(d)$ is above some specified value $P_c$, the receiver is able to
communicate with the transmitter. In particular, if $\sigma_{dB} = 0$,
the log-normal model is equivalent to the unit disk model with the
transmission range

\[ r = \left( \frac{\tau^2 G_T G_R P_T}{(4\pi)^2 P_c} \right)^{1/\alpha} \tag{3} \]

Hence, under the unit disk model, the communication coverage of each
node is a perfect disk of radius $r$. In further discussions, $r$ is not
limited to be the transmission range of the unit disk model but is a
generalized parameter defined by (3).

In effect, the randomness of the received signal power $P_R(d)$ can be
described by a function $g(d)$ denoting the probability that a
directional communication link exists from transmitter to receiver with
distance $d$. On the basis of $g(d)$, a generic channel model can be
defined once $g(d)$ satisfies the following restrictions:

\[ g(d_1) = g(d_2), \quad \text{if } d_1 = d_2 \tag{4} \]

\[ g(d_1) \le g(d_2), \quad \text{if } d_1 > d_2 \tag{5} \]

\[ 0 < \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g\big(\sqrt{x^2 + y^2}\big)\,dx\,dy < \infty \tag{6} \]

The generic channel model has been treated intensively in percolation
theory [12,13]. The first restriction indicates that the propagation
path is symmetric; the second one indicates that $g(d)$ must be a
non-increasing function of $d$; and the third one avoids the trivial
cases that any two nodes are directly connected with probability 1 and
that any two nodes are isolated with probability 1. It can be easily
shown that both the unit disk model and the log-normal model satisfy
these restrictions [14].

As the transmit power of each node is actually tunable in many wireless
sensor networks [15], we require that during the period of running the
proposed distance estimation method, all nodes transmit at a common
power level, that is, $P_T$. Furthermore, throughout the paper, we make
the following assumptions (as is commonly the case in the literature).

Assumption 1. The attenuations caused by shadowing effects (i.e., $Z$)
between any pairs of nodes are independent and identically distributed.

Assumption 2. Communication links are symmetric, namely that node A can
directly receive packets from node B as long as node B can directly
receive packets from node A.

Even though field measurements in real applications seem to indicate
that the attenuations between two links with a common node are
correlated [16], Assumption 1 is generally considered appropriate for
far field transmission and is widely used in the literature [14,16-20].
Although the above assumptions may not fully reflect a real network
environment, they still enable us to obtain some results as estimates
for more realistic situations.

3. THE DISTANCE ESTIMATION METHOD

In this section, we present the method of estimating distances via
connectivity and detail its implementation under the log-normal model.

3.1. Estimating distances under the unit disk model

In a static wireless sensor network, given two nodes A and B with
coordinates $(x_A, y_A)$ and $(x_B, y_B)$, their distance is defined to
be $d$ ($d < r$), and two disks with the same radius $r$ represent their
individual communication coverage under the unit disk model, as shown in
Figure 1. Because $d < r$, the two disks intersect and create three
disjoint regions. Regarding $r$ as a constant, we define $S = \pi r^2$
and $f(d)$ to be the area of the middle region in Figure 1, where

\[ f(d) = \frac{2S}{\pi} \arccos\!\left(\frac{d}{2r}\right) - d\,\sqrt{r^2 - \frac{d^2}{4}} \tag{7} \]

Figure 1. The communication coverage of two nodes under the unit disk
model.

It is obvious that the nodes residing in the middle region are common
immediate neighbors of A and B, and the nodes residing in the left
(right) one are non-common immediate neighbors of A (B). Define three
random variables $M$, $P$, and $Q$ to be the numbers of the three
categories of neighbors. Obviously, they are mutually independent and
Poisson with means $\lambda f(d)$, $\lambda(S - f(d))$, and
$\lambda(S - f(d))$, as pointed out in [13]. The actual values of $M$,
$P$, and $Q$ can be easily obtained after A and B exchange their
neighborhood information. On the basis of the observations of $M$, $P$,
and $Q$, an MLE for estimating $d$ is summarized as follows:

Theorem 1. $M$, $P$, and $Q$ are mutually independent Poisson random
variables with means $\lambda f(d)$, $\lambda(S - f(d))$, and
$\lambda(S - f(d))$, respectively. If $f(d)$ is invertible and $S$ is a
non-zero constant, then the MLE for $d$, termed $\hat{d}$, is

\[ \hat{d} = \begin{cases} f^{-1}(S), & \text{if } M = P = Q = 0; \qquad (8) \\ f^{-1}(\rho S), & \text{otherwise} \qquad (9) \end{cases} \]

where $\rho = 2M / (2M + P + Q)$.

Proof. See Appendix A. □

Note that the actual value of $\lambda$ is not needed in obtaining
$\hat{d}$. Although nodes are assumed to follow a random and uniform
distribution of density $\lambda$ (as a result of the Poisson point
process), the derivation of the MLE indicates that as long as the nodes
in a local region that covers the communication coverage of two
neighboring nodes admit a uniform density, the proposed method is
reasonably applicable. Moreover, in some applications, nodes may be
placed on the basis of certain regular structures but with noise. For
example, in a two-dimensional sensor network, the x-coordinate and
y-coordinate of each node may be Gaussian with the same variance and
with mean values at a grid point. Compared with a random and uniform
distribution, such a distribution is even closer to uniform, and the
proposed method can thus attain better performance. In addition, if the
least squares method instead of the MLE is applied here, the resulting
expression of the distance estimator is actually the same as that in
Theorem 1.
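To make Theorem 1 concrete under the unit disk model, the following sketch evaluates $f(d)$ from (7), inverts it by bisection, and returns $\hat{d}$ from observed counts. It is a minimal illustration: inverting $f$ over the whole interval $[0, 2r]$, the tolerance and the function names are our assumptions rather than details given in the text.

```python
import math

def overlap_area(d, r=1.0):
    """f(d) in (7): area shared by two disks of radius r at distance d."""
    s = math.pi * r * r
    root = math.sqrt(max(r * r - d * d / 4.0, 0.0))
    return (2.0 * s / math.pi) * math.acos(d / (2.0 * r)) - d * root

def inverse_overlap_area(area, r=1.0, tol=1e-12):
    """Invert f by bisection; f decreases from S at d = 0 to 0 at d = 2r."""
    lo, hi = 0.0, 2.0 * r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if overlap_area(mid, r) > area:
            lo = mid          # overlap still too large: move further apart
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_distance(m, p, q, r=1.0):
    """MLE of Theorem 1 from the counts M, P, Q (lambda is not needed)."""
    s = math.pi * r * r
    if m == 0 and p == 0 and q == 0:
        return inverse_overlap_area(s, r)
    rho = 2.0 * m / (2.0 * m + p + q)
    return inverse_overlap_area(rho * s, r)

# Feeding in the expected counts at d = 0.6 should recover d.
lam, d_true = 1000.0, 0.6
f_true, s_disk = overlap_area(d_true), math.pi
d_hat = estimate_distance(lam * f_true, lam * (s_disk - f_true),
                          lam * (s_disk - f_true))
```

Because the estimator depends on the counts only through the ratio $\rho$, scaling all three counts by the same factor leaves $\hat{d}$ unchanged, which is why the density does not have to be known.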


3.2. Extension under the generic channel model


Under the generic channel model defined by $g(d)$, $M$, $P$, and $Q$
continue to denote the numbers of common and non-common immediate
neighbors associated with two nodes. First, we can compute their
expectations as follows:

\[ E(M + P) = E(M + Q) = \lambda \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g\big(\sqrt{(x - x_B)^2 + (y - y_B)^2}\big)\,dx\,dy \tag{10} \]

\[ E(M) = \lambda \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g\big(\sqrt{(x - x_A)^2 + (y - y_A)^2}\big) \times g\big(\sqrt{(x - x_B)^2 + (y - y_B)^2}\big)\,dx\,dy \tag{11} \]

Then, from the third restriction on $g(d)$, that is, (6), it follows
that $E(M) \le E(M + P) < \infty$. Under the unit disk model, the
independence among $M$, $P$, and $Q$ is straightforward because of the
three disjoint regions. The generic channel model does not necessarily
lead to three such disjoint regions, but the following theorem still
guarantees the mutual independence.

Theorem 2. Suppose a static wireless sensor network is deployed in an
infinite plane according to a homogeneous Poisson process of density
$\lambda$ and conforms to the generic channel model defined by $g(d)$;
given two nodes in this wireless sensor network, let $M$ be the number
of their common immediate neighbors and $P$ and $Q$ be the numbers of
their non-common immediate neighbors. Then, $M$, $P$, and $Q$ are
mutually independent Poisson random variables.

Proof. See Appendix B. □

Under the generic channel model, $S$ and $f(d)$ are generalized to
specify the expectations of $M$, $P$ and $Q$, instead of the areas
defined under the unit disk model, and have the forms of

\[ S = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g\big(\sqrt{(x - x_B)^2 + (y - y_B)^2}\big)\,dx\,dy \tag{12} \]

\[ f(d) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g\big(\sqrt{(x - x_A)^2 + (y - y_A)^2}\big) \times g\big(\sqrt{(x - x_B)^2 + (y - y_B)^2}\big)\,dx\,dy \tag{13} \]

Therefore, if $S$ and $f(d)$ satisfy the conditions in Theorem 1, the
MLE is directly applicable.

In reality, however, sensor networks are deployed in regions of finite
area, and thus, the expectations of $M$, $P$, and $Q$ associated with
two nodes, especially those near network boundaries, cannot be derived
by simply integrating over an infinite plane to compute $S$ and $f(d)$.
These are termed boundary effects. In this study, we concentrate on the
theoretical foundations of the proposed method and will tackle boundary
effects in our future work.

Before the proposed method can be implemented in a static wireless
sensor network, the wireless channel, that is, $g(d)$, must be known, so
that the quantity $S$, the function $f(d)$, and its inverse can be
determined and then programmed into each node. After that, because of
the simple mechanism of the proposed method, a distributed protocol can
be easily designed in which each node collects and exchanges local
connectivity information through broadcasting operations. Once a node
obtains the neighborhood information of all its immediate neighbors, it
is able to estimate the distances to its immediate neighbors using the
inverse of $f(d)$, $S$, and the MLE in Theorem 1.
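The node-side bookkeeping described above can be sketched as follows. We assume, purely for illustration, that each node has received the neighbor lists broadcast by its immediate neighbors and stores them in a dictionary keyed by node identifier; the function names and the data layout are hypothetical. The resulting ratio $\rho$ would then be passed through the preprogrammed inverse of $f(d)$.

```python
def local_connectivity(a, b, neighbor_lists):
    """Counts (M, P, Q) for the pair (a, b) from exchanged neighbor lists."""
    na = set(neighbor_lists[a]) - {a, b}
    nb = set(neighbor_lists[b]) - {a, b}
    return len(na & nb), len(na - nb), len(nb - na)

def connectivity_ratio(m, p, q):
    """rho = 2M / (2M + P + Q); None when no neighbor is observed at all."""
    total = 2 * m + p + q
    return None if total == 0 else 2.0 * m / total

def ratios_for_node(a, neighbor_lists):
    """rho for every immediate neighbor of a whose list has been received."""
    return {b: connectivity_ratio(*local_connectivity(a, b, neighbor_lists))
            for b in neighbor_lists[a] if b in neighbor_lists}

# Hypothetical neighbor lists as they might be known at node "A".
lists = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E", "F"],
    "C": ["A", "B"],
    "D": ["A"],
}
ratios = ratios_for_node("A", lists)
```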

If Assumption 2 holds in a sensor network, each pair of neighboring
nodes will have identical information for estimating their distance and
thus will obtain the same distance estimate; otherwise, provided that A
can hear B but B cannot hear A, A will estimate their distance, whereas
B will not. Such asymmetry in distance estimation can be alleviated by
allowing each node to exchange two-hop neighborhood information with its
immediate neighbors.

Clearly, if the size of each sensor's neighborhood has the magnitude of
$O(1)$, the complexities for communications and computations of running
the proposed method in this network are both $O(n)$, where $n$ is the
number of nodes in a wireless sensor network, implying that the proposed
method is efficient and scalable.

3.3. Implementation under the log-normal model

Provided that the wireless channel is known to be log-normal with known
parameters $r$, $\sigma_{dB}$, and $\alpha$, we shall demonstrate how to
determine $g(d)$, $S$, and $f(d)$ involved in the proposed method.

3.3.1.  Formulating  g(d ) 

Under the log-normal model, for a transmitter and receiver
pair at distance $d$, if the received signal power described
by (1) is no less than $P_c$, a bi-directional communication
link exists between them (as a result of the symmetry in
Assumption 2). The probability that the two nodes are able to
communicate with each other, that is, $g(d)$, is

\[
g(d) = \int_{k \log(d/r)}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_{dB}}
       \, e^{-z^2/(2\sigma_{dB}^2)} \, dz
\tag{14}
\]

where $k = 10\alpha/\log 10$ and log denotes the natural logarithm.
Evidently, $g(d)$ is determined by $r$, $\alpha$, and $\sigma_{dB}$,
where $r$ can be easily computed given the parameters in (3),
and $\alpha$ and $\sigma_{dB}$ can be derived from measurements
obtained prior to the deployment of sensor networks or assigned
empirically on the basis of the characteristics of the
deployment environment [11]. Alternatively, using the technique
presented in [21], $\alpha$ and $\sigma_{dB}$ can be estimated by
processing RSS measurements (without any distance
measurements); depending on the level of noise and the amount
of measurement data, the resulting estimates of $\alpha$ and
$\sigma_{dB}$ may be imprecise.

We plot $g(d)$ with respect to $d$ and $\sigma_{dB}$, given
$\alpha = 4$ and $r = 1$, in Figure 2(a). It can be seen that the
smaller $d$ is, the higher is the probability that a
communication link exists; compared with a smaller
$\sigma_{dB}$, a larger $\sigma_{dB}$ tends to inhibit
communications at small $d$ but promotes communications at
large $d$.
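
The behavior described above can be checked numerically. The sketch below assumes the standard log-normal form in which a link exists when a zero-mean Gaussian shadowing term (standard deviation $\sigma_{dB}$, in dB) is at least $k \log(d/r)$; this form is an assumption consistent with the definition of $k$, and the function name is hypothetical.

```python
import math


def link_probability(d, r=1.0, alpha=4.0, sigma_db=4.0):
    """Probability that two nodes at distance d are connected.

    Assumed form: Pr(Z >= k*log(d/r)) with Z ~ N(0, sigma_db^2) and
    k = 10*alpha/log(10); sigma_db = 0 reduces to the unit disk model.
    """
    if d <= 0.0:
        return 1.0
    k = 10.0 * alpha / math.log(10.0)
    x = k * math.log(d / r)
    if sigma_db == 0.0:
        return 1.0 if x <= 0.0 else 0.0
    return 0.5 * math.erfc(x / (math.sqrt(2.0) * sigma_db))


for d in (0.5, 1.0, 2.0):
    row = [round(link_probability(d, sigma_db=s), 4) for s in (0.0, 4.0, 8.0)]
    print(d, row)
```

Under this assumed form, a larger $\sigma_{dB}$ lowers the link probability for $d < r$ and raises it for $d > r$, with the probability equal to 0.5 at $d = r$ whenever $\sigma_{dB} > 0$.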

In view of the restrictions on $g(d)$, it is straightforward
to obtain $\lim_{d \to \infty} g(d) = 0$; as such, for an extremely
small and positive $\varepsilon$, there exists $d_{th}$ such that
$g(d) < \varepsilon$ if $d > d_{th}$. That is to say, nodes whose
distance to a given node is longer than $d_{th}$ hardly
communicate with this node directly; as such, $d_{th}$ is a
surrogate of the transmission range. This phenomenon can be
observed in Figure 2(a).

3.3.2. Formulating $S$ and $f(d)$

In [17,19], the expectation $E(M + P)$ (or $E(M + Q)$) has
been well studied. Thus, we have

\[
S = \pi r^2 e^{2\sigma_{dB}^2/k^2}
\tag{15}
\]

By (11), (13), and (14), we can derive the formula for
$f(d)$ under the log-normal model. By letting $\alpha = 4$ and
$r = 1$, we plot $f(d)$ with respect to different values of $d$
and $\sigma_{dB}$ in Figure 2(b).

As can be seen in Figure 2(b), $f(d)$ is monotonically
decreasing and invertible; hence, Theorem 1 is applicable
under the log-normal model. However, closed-form formulas
for $f(d)$ and its inverse are unavailable. Instead, we can
establish a piecewise linear function to approximate the
inverse; for each affine segment, a linear regression model
is applied to predict $d$. Because two nodes more than $d_{th}$
apart hardly communicate with each other directly, we restrict
the distance estimates to lie between 0 and $d_{th}$. In a real
estimation process, however, $\hat{\rho}S$ may fall outside
$[f(d_{th}), f(0)]$, and consequently $\hat{d}$ may fall outside
$[0, d_{th}]$. Therefore, we obtain the distance estimator
as follows:


\[
\hat{d} =
\begin{cases}
0, & \text{if } M = P = Q = 0 \text{ or } \hat{\rho}S > f(0); \\
f^{-1}(\hat{\rho}S), & \text{if } f(d_{th}) \le \hat{\rho}S \le f(0); \\
d_{th}, & \text{if } \hat{\rho}S < f(d_{th})
\end{cases}
\tag{16}
\]

Figure 2. The functions $g(d)$ and $f(d)$ under the log-normal model with $\alpha = 4$ and $r = 1$.
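
The clamped estimator in (16), together with a piecewise linear approximation of the inverse of $f$, can be sketched as follows. Because no closed form of $f(d)$ is available for $\sigma_{dB} > 0$, the example uses the unit disk case ($\sigma_{dB} = 0$, $r = 1$, $d_{th} = 1$), where $f(d)$ is the overlap area of two discs and $S = \pi$; this choice, the plain interpolation (instead of per-segment regression), and all names are assumptions made for illustration.

```python
import math
from bisect import bisect_left


def f_unit_disk(d, r=1.0):
    """Overlap area of two discs of radius r with centres d apart (d <= 2r)."""
    return (2.0 * r * r * math.acos(d / (2.0 * r))
            - 0.5 * d * math.sqrt(4.0 * r * r - d * d))


def make_estimator(f, S, d_th, segments=1000):
    """Return a function (m, p, q) -> distance estimate, clamped as in (16)."""
    # f decreases with d, so sample d from d_th down to 0 to obtain
    # f values in increasing order, as required for bisection.
    ds = [d_th * i / segments for i in range(segments, -1, -1)]
    fs = [f(d) for d in ds]

    def estimate(m, p, q):
        total = 2 * m + p + q
        if total == 0:
            return 0.0
        target = 2.0 * m / total * S
        if target > fs[-1]:      # above f(0)
            return 0.0
        if target < fs[0]:       # below f(d_th)
            return d_th
        i = bisect_left(fs, target)
        if i == 0:
            return ds[0]
        # Linear interpolation on the segment between samples i-1 and i.
        t = (target - fs[i - 1]) / (fs[i] - fs[i - 1])
        return ds[i - 1] + t * (ds[i] - ds[i - 1])

    return estimate


estimate = make_estimator(f_unit_disk, math.pi, 1.0)
print(estimate(5, 2, 3), estimate(0, 3, 4), estimate(5, 0, 0))
```

For another channel, only the sampled function `f`, the value of `S`, and `d_th` would need to change.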


4.  PERFORMANCE  ANALYSIS 

In this section, we evaluate the performance of the
proposed method under the log-normal model from different
perspectives.

4.1. Impact of imprecise $\alpha$ and $\sigma_{dB}$

In the proposed method, the parameters of the log-normal
model, that is, $\alpha$ and $\sigma_{dB}$, are supposed to be known
precisely. But what if their values are imprecise? To answer
this question, we define

\[
\rho_{\alpha,\sigma_{dB}}(d) = \frac{f(d)}{S}
\tag{17}
\]

where $f(d)$ and $S$ are computed using (13), (15), and
(14) given $\alpha$ and $\sigma_{dB}$. Thus, the distance estimator
in (16) is equivalently

\[
\hat{d} =
\begin{cases}
0, & \text{if } M = P = Q = 0 \text{ or } \hat{\rho} > \rho_{\alpha,\sigma_{dB}}(0); \\
\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho}), & \text{if } \rho_{\alpha,\sigma_{dB}}(d_{th}) \le \hat{\rho} \le \rho_{\alpha,\sigma_{dB}}(0); \\
d_{th}, & \text{if } \hat{\rho} < \rho_{\alpha,\sigma_{dB}}(d_{th})
\end{cases}
\tag{18}
\]

Evidently, using imprecise $\alpha$ and/or $\sigma_{dB}$ results in
an incorrect function $\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho})$, with
the result that the accuracy of the distance estimate $\hat{d}$
is degraded. We plot the function
$\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho})$ with respect to different
values of $\alpha$ and $\sigma_{dB}$ in Figure 3.

Figure 3. The inverse function of $\rho_{\alpha,\sigma_{dB}}(d)$ under the log-normal model: (a) $\sigma_{dB} = 4$; (b) $\alpha = 4$.

Supposing $\sigma_{dB}$ is known to be exactly 4, we investigate
the impact of the uncertainty in $\alpha$. As shown in Figure 3(a),
$\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho})$ is much more sensitive to a
small $\alpha$ than to a large $\alpha$; in other words, for a small
$\alpha$, using an imprecise version of $\alpha$ tends to degrade
the accuracy of the distance estimate $\hat{d}$ more seriously
than for a large $\alpha$. Moreover, if $\alpha$ is overestimated,
then an underestimated $\hat{d}$ will be produced for a small $d$
but an overestimated $\hat{d}$ for a large $d$, and vice versa.
However, $\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho})$ does not demonstrate
the same sensitivity to $\sigma_{dB}$ as is observed for $\alpha$,
as illustrated in Figure 3(b). We can conclude that if
$\sigma_{dB}$ is overestimated, then an overestimated $\hat{d}$ will
be produced for a small $d$ but an underestimated $\hat{d}$ for a
large $d$, and vice versa.


4.2. Bias and standard deviation

According to Theorem 1, all possible values of $\hat{\rho}$ are
rational numbers within $[0, 1]$, so that $\hat{d}$ is a discrete
random variable and its $j$th moment is as follows:

\[
E(\hat{d}^{\,j}) = \sum_{a} \left[ a^{j} \Pr(\hat{d} = a) \right]
\tag{19}
\]

We divide the range of $\hat{d}$, that is, $[0, d_{th}]$, into $w$
equal intervals: $T_1 = [z_0, z_1), \ldots, T_w = [z_{w-1}, z_w]$
with $z_i = (i \times d_{th})/w$. Given a sufficiently large $w$,
$\hat{d}$ is approximately constant over each interval, denoted
$d_i$. Then we can approximately reformulate (19) as

\[
E(\hat{d}^{\,j}) \approx \sum_{i=1}^{w} \left[ d_i^{\,j} \Pr(\hat{d} \in T_i) \right]
\tag{20}
\]

For the probability associated with the $i$th interval
$T_i$, we have

\[
\Pr(\hat{d} \in T_i) =
\begin{cases}
\Pr(f(z_1) < \hat{\rho}S \le S), & \text{if } i = 1; \\
\Pr(f(z_i) < \hat{\rho}S \le f(z_{i-1})), & \text{if } 1 < i < w; \\
\Pr(0 \le \hat{\rho}S \le f(z_{w-1})), & \text{if } i = w
\end{cases}
\]

Table I. The values of $\lambda$ with respect to different $\sigma_{dB}$ and $\mu$ when $\alpha = 4$.

sigma_dB    mu = 5    mu = 10    mu = 20    mu = 30    mu = 40
0           1.59      3.18       6.37       9.55       12.73
4           1.43      2.86       5.73       8.59       11.45
8           1.04      2.08       4.17       6.25       8.33
12          0.61      1.23       2.45       3.68       4.90

By letting $Y = P + Q$, we have

\[
\Pr(b < \hat{\rho}S \le c) = \sum_{y=0}^{\infty}
  \left[ \Pr(b < \hat{\rho}S \le c \mid Y = y) \times \Pr(Y = y) \right]
\]

which makes it possible for us to numerically evaluate the
moments of $\hat{d}$ and thus the bias and standard deviation.
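
A numerical evaluation of this kind can be sketched by summing directly over the Poisson counts rather than over the intervals of (20). The sketch assumes that $M$ and $Y = P + Q$ are independent Poisson variables with means $\lambda f(d)$ and $2\lambda(S - f(d))$, and it uses the unit disk case ($\sigma_{dB} = 0$, $r = 1$, $d_{th} = 1$) so that $f$ has a closed form; these choices and all names are assumptions made for illustration.

```python
import math


def poisson_pmf(mean, n_max):
    """Pr(X = 0..n_max) for X ~ Poisson(mean), via the usual recurrence."""
    pmf = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        pmf.append(pmf[-1] * mean / n)
    return pmf


def f_unit_disk(d):
    """Overlap area of two unit discs with centres d apart."""
    return 2.0 * math.acos(d / 2.0) - 0.5 * d * math.sqrt(4.0 - d * d)


def inverse_f(target, lo=0.0, hi=1.0):
    """Bisection inverse of the decreasing f on [lo, hi], clamped as in (16)."""
    if target >= f_unit_disk(lo):
        return lo
    if target <= f_unit_disk(hi):
        return hi
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if f_unit_disk(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bias_and_std(d, mu):
    """Return (total probability mass, bias, standard deviation) at distance d."""
    S = math.pi                              # r = 1, sigma_dB = 0
    lam = mu / S                             # assumes mu = lambda * S
    mean_m = lam * f_unit_disk(d)
    mean_y = 2.0 * lam * (S - f_unit_disk(d))
    top = max(mean_m, mean_y)
    n_max = int(top + 12.0 * math.sqrt(top) + 30.0)   # truncation point
    pm, py = poisson_pmf(mean_m, n_max), poisson_pmf(mean_y, n_max)
    mass = m1 = m2 = 0.0
    for m, wm in enumerate(pm):
        for y, wy in enumerate(py):
            total = 2 * m + y
            est = 0.0 if total == 0 else inverse_f(2.0 * m / total * S)
            w = wm * wy
            mass += w
            m1 += w * est
            m2 += w * est * est
    return mass, m1 - d, math.sqrt(max(m2 - m1 * m1, 0.0))


mass10, bias10, std10 = bias_and_std(0.5, 10.0)
mass40, bias40, std40 = bias_and_std(0.5, 40.0)
print(bias10, std10, bias40, std40)
```

The returned probability mass serves as a check that the truncation of the two sums loses nothing significant.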

Let $\mu$ be the expected number of immediate neighbors
of a node, namely $\mu = E(M + P) = E(M + Q)$; the values of
$\lambda$ with respect to different $\sigma_{dB}$ and $\mu$ are listed
in Table I. For clearer presentation, the connectivity index
$\mu$ will be used in the following discussions instead of the
node density $\lambda$.
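
The entries of Table I can be reproduced from the expression for $S$ in (15), assuming the relation $\mu = \lambda S$ between the connectivity index and the node density. The sketch below is illustrative only; the function names are hypothetical.

```python
import math


def coverage_S(r=1.0, alpha=4.0, sigma_db=0.0):
    """S = pi * r^2 * exp(2 * sigma_db^2 / k^2), with k = 10*alpha/log(10)."""
    k = 10.0 * alpha / math.log(10.0)
    return math.pi * r * r * math.exp(2.0 * sigma_db ** 2 / k ** 2)


def density_for_degree(mu, **channel):
    """Node density lambda giving an expected node degree mu (mu = lambda*S)."""
    return mu / coverage_S(**channel)


for sigma_db in (0, 4, 8, 12):
    row = [round(density_for_degree(mu, sigma_db=sigma_db), 2)
           for mu in (5, 10, 20, 30, 40)]
    print(sigma_db, row)
```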

Given $\alpha = 4$, $r = 1$, $w = 1000$, and $\mu$ varying from
5 to 40, Figure 4 depicts the numerical bias and standard
deviation associated with the proposed method and the
corresponding simulation results. The two groups of results
are highly consistent, and the comparatively non-smooth
aspect of some of the curves, for example, in Figure 4(a),
is probably attributable to the fact that all observations
of $M$, $P$, and $Q$ in related results are necessarily
integers, and such observations are used in determining
the curves.

As shown in Figure 4, the proposed method is clearly
biased, but the absolute bias is much less than the standard
deviation in most cases; except for $\sigma_{dB} = 0$, the
absolute bias and standard deviation are comparable with the
true distances, especially for short distances and sparse
sensor networks. In particular, when $\mu = 5$, their values are
extraordinarily large, nearly twice the corresponding
values when $\mu = 10$. Moreover, as $\mu$ increases, the
standard deviation always decreases, while the absolute bias
decreases in most cases. An intuitive explanation is that as
$\mu$ increases, the variances of the ratios $2M/E(2M)$ and
$(2M+P+Q)/E(2M+P+Q)$ both decrease, so the variance
of $\hat{\rho}$ is reduced and so is the variance of $\hat{d}$. As
mentioned in the previous section, a large $\sigma_{dB}$ promotes
communications between distant nodes but inhibits
communications between close nodes; as a result, connectivity
is related to a wide range of distances, so that the geometric
information implied by connectivity becomes less accurate.
Hence, the larger $\sigma_{dB}$ is, the worse are both the bias
and the standard deviation.


4.3.  Root  mean  square  error 

As a performance measure, the RMSE is defined to be the
square root of the sum of the squared bias and the variance
of the estimation errors. We plot the RMSE of $\hat{d}$ produced
by the proposed method in Figure 5. As can be seen, the RMSE
decreases as $\mu$ increases and $\sigma_{dB}$ decreases, which is
consistent with how the bias and standard deviation of
$\hat{d}$ depend on $\mu$ and $\sigma_{dB}$. When $d$ is near 0, the
RMSE is extraordinarily large compared with the true value of
$d$, implying that the proposed method fails to provide
reasonable estimates for short distances. This
underperformance at short distances limits the use of the
proposed method in practice and is due to the combined impact
of the following facts:


•  It is evident that the variance of $2M + P + Q$ is
constant no matter what $d$ is, but the variance of $2M$
increases as $d$ decreases; as a result, $\hat{\rho}$, that is,
$2M/(2M + P + Q)$, is more likely to suffer a larger
variance when $d$ is small than when $d$ is large.

•  As depicted in Figure 3, $\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho})$ is
quite sensitive to $\hat{\rho}$ when $d$ is small; that is, a small
perturbation in $\hat{\rho}$ leads to a big change in $\hat{d}$ and
thus a big distance estimation error.

•  In light of (18), $\hat{d}$ is simply set to 0 when $\hat{\rho}$ is
greater than $\rho_{\alpha,\sigma_{dB}}(0)$, but a small $d$ often causes
$\hat{\rho}$ to lie within $[\rho_{\alpha,\sigma_{dB}}(0), 1]$, and hence
the underperformance arises.


To conclude, for short distances, the non-smooth aspect
and the sensitivity to $\hat{\rho}$ of the function defined in (18)
are responsible for the underperformance.

Under the log-normal model with $\sigma_{dB} > 0$, distance
estimation can be realized by using the RSS measurements,
that is, received signal powers. The bias and variance of
the resulting distance estimate (denoted $\hat{d}_{RSS}$) are
provided in [22], so that we can compute the RMSE of
$\hat{d}_{RSS}$ and compare it with that of $\hat{d}$ in Figure 5. It
can be seen that (i) the RMSE of $\hat{d}_{RSS}$ increases in direct
proportion to $d$, whereas that of $\hat{d}$ varies comparatively
little as $d$ increases, and (ii) the proposed method
outperforms the RSS method for long distances by a
large margin.


5.  ANALYSIS  BASED  ON  THE 
CRAMER-RAO  LOWER  BOUND 

In this section, we formulate the CRLB regarding the
distance estimation problem via connectivity, that is,
estimating $d$ from $M$, $P$, and $Q$, under the log-normal
model. For this estimation problem, the unknown parameters
are $d$ and $\lambda$. The Fisher Information Matrix (FIM) for this
estimation problem, denoted FIM$(d, \lambda)$, is


Wirel.  Commun.  Mob.  Comput.  (2012)  ©  2012  John  Wiley  &  Sons,  Ltd. 
DOI:  10.1002/wcm 


Estimating  distances  via  connectivity  in  wireless  sensor  networks 


B.  Huang  et  al. 


Figure 4. The bias and standard deviation of distance estimation from numerical evaluations (solid lines) and simulations (dashed
lines) with $\mu$ = 5, 10, 20, 30, and 40, $\alpha = 4$, and $r = 1$. For the standard deviation, a larger $\mu$ corresponds to a line to the bottom; for
the bias, a larger $\mu$ corresponds to a line with the bias closer to 0.
Panels: (a) Bias ($\sigma_{dB} = 0$, $d_{th} = 1$); (b) Standard deviation ($\sigma_{dB} = 0$, $d_{th} = 1$); (c) Bias ($\sigma_{dB} = 4$, $d_{th} = 2$);
(d) Standard deviation ($\sigma_{dB} = 4$, $d_{th} = 2$); (e) Bias ($\sigma_{dB} = 8$, $d_{th} = 4$); (f) Standard deviation ($\sigma_{dB} = 8$, $d_{th} = 4$);
(g) Bias ($\sigma_{dB} = 12$, $d_{th} = 6$); (h) Standard deviation ($\sigma_{dB} = 12$, $d_{th} = 6$).

Figure 5. The RMSE of $\hat{d}_{RSS}$ (dashed lines) and $\hat{d}$ (solid lines; $\mu$ = 10, 20, 30, and 40 with $\alpha = 4$ and $r = 1$; a larger $\mu$ corresponds to
a curve to the bottom) under the log-normal model. Panels include (c) $\sigma_{dB} = 8$, $d_{th} = 4$ and (d) $\sigma_{dB} = 12$, $d_{th} = 6$.


\[
\mathrm{FIM}(d, \lambda) =
\begin{pmatrix}
\dfrac{\lambda\,(f'(d))^2\,(f(d) + S)}{f(d)\,(S - f(d))} & -f'(d) \\[2ex]
-f'(d) & \dfrac{2S - f(d)}{\lambda}
\end{pmatrix}
\tag{22}
\]

where $f(d)$ is differentiable to the first order in (13) (see
Appendix C). Then, the CRLB for $d$ by using any unbiased
estimator, denoted CRLB$(d)$, is

\[
\mathrm{CRLB}(d) = \frac{(S - f(d))\,(2S - f(d))\,f(d)}{2\lambda S^2 (f'(d))^2}
\tag{23}
\]

Although the CRLB is only valid for unbiased distance
estimates and the proposed method is known to be biased,
it is still helpful for understanding the essential features
of the distance estimation problem. In what follows, we shall
investigate the influences of various parameters.


5.1. Influence of $\lambda$

It is clear that the CRLB is inversely proportional to $\lambda$.
In other words, better estimation accuracy can be attained
in dense wireless sensor networks, which is intuitive and
is also illustrated in Figure 5. Hence, it is attractive to
apply the proposed method in dense wireless sensor networks.
Dense wireless sensor networks are, moreover, genuinely
required in some circumstances. For example, because of
the limited energy resource at each node, nodes are usually
deployed in high density and may take turns to be active
in order to prolong the network lifetime [23]; accordingly,
many scheduling strategies have been developed to determine
when and which sensors should be powered up and
which sensors should be put into energy saving mode while
satisfying certain coverage and connectivity requirements
[24-29].


5.2.  Influence  of  d 

According to (23), it is difficult to observe the influence
of $d$ on the CRLB directly, for we do not have closed-form
formulas for $f(d)$ and $f'(d)$ except in the case of
$\sigma_{dB} = 0$. But because it is easily justified that the
numerator of (23) is bounded in a narrow range, if the
denominator can be very small, the CRLB will be seriously
affected by the denominator. On the basis of Figure 2(b), we
can obtain some preliminary knowledge about the key component
of the denominator, that is, $f'(d)$. As can be seen
from Figure 2(b), as $d$ increases from 0, $|f'(d)|$ first
rises and then decreases once $d$ exceeds some value that
depends on $\sigma_{dB}$; as $d$ increases further, $|f'(d)|$
continues to decrease and approaches 0. Hence, it is
postulated that the CRLB will experience a rise
with $d$ increasing.
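
For the one case with closed-form $f(d)$ and $f'(d)$, namely $\sigma_{dB} = 0$, the bound can be evaluated directly. The sketch below assumes the unit disk model, where $f(d)$ is the overlap area of two discs of radius $r$ and $f'(d) = -\sqrt{4r^2 - d^2}$, and compares the closed form (23) with the first diagonal entry of the inverse of the matrix in (22). In this special case $|f'(d)|$ decreases monotonically, so the initial rise of $|f'(d)|$ described above for $\sigma_{dB} > 0$ does not appear. All function names are hypothetical.

```python
import math


def f_unit_disk(d, r=1.0):
    """Overlap area of two discs of radius r with centres d apart."""
    return (2.0 * r * r * math.acos(d / (2.0 * r))
            - 0.5 * d * math.sqrt(4.0 * r * r - d * d))


def df_unit_disk(d, r=1.0):
    """Derivative of f_unit_disk with respect to d."""
    return -math.sqrt(4.0 * r * r - d * d)


def fisher_matrix(d, lam, r=1.0):
    """2x2 matrix of (22) for the parameters (d, lambda); requires d > 0."""
    f, df, S = f_unit_disk(d, r), df_unit_disk(d, r), math.pi * r * r
    return [[lam * df * df * (f + S) / (f * (S - f)), -df],
            [-df, (2.0 * S - f) / lam]]


def crlb_closed_form(d, lam, r=1.0):
    """Closed form (23)."""
    f, df, S = f_unit_disk(d, r), df_unit_disk(d, r), math.pi * r * r
    return (S - f) * (2.0 * S - f) * f / (2.0 * lam * S * S * df * df)


def crlb_from_fim(d, lam, r=1.0):
    """First diagonal entry of the inverse of the 2x2 Fisher matrix."""
    (a, b), (c, e) = fisher_matrix(d, lam, r)
    return e / (a * e - b * c)


for d in (0.2, 0.5, 1.0):
    print(d, crlb_closed_form(d, 3.0), crlb_from_fim(d, 3.0))
```

In this special case the bound grows with $d$ over $(0, r]$ and scales as $1/\lambda$.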

As shown in Figure 4, in the cases of $\sigma_{dB} = 8$ and 12,
the standard deviation displays an evident rise when $d$ is
larger than some value and then drops when $d$ approaches
$d_{th}$. The reason for this drop is that we restrict the
maximal distance estimate to be $d_{th}$, so that estimates for
$d$ near $d_{th}$ are improved. In the case of $\sigma_{dB} = 4$,
$|f'(d)|$ is not so close to 0 when $d$ is near $d_{th}$ but is
comparatively small when $d$ is near 0; as a result, the
expected rise does not happen.

5.3. Influences of $P_T$ and $P_c$

Provided that $G_T$, $G_R$, and $\alpha$ are known in (3), $r$ is
proportional to $(P_T/P_c)^{1/\alpha}$, where $P_T$ is the
transmission power and $P_c$ is the threshold power for
communications. If both $P_T$ and $P_c$ are tunable in wireless
sensor networks (which does not violate the requirement of a
common transmission power $P_T$), it is meaningful to analyze
their influences on the CRLB. For simplicity, we shall use $r$
instead of $P_T$ and $P_c$ to carry out the analysis. To do so,
we derive the following theorem.

Theorem 3. Consider the CRLB that is defined on the
basis of (13), (15), and (23), and suppose only $d$ and
$r$ are variables. Then, the CRLB for a given distance $d$
with $r = r_0$ is equal to the CRLB for a distance $d/r_0$
with $r = 1$.

Proof.  See  Appendix  D.  □ 

This theorem reveals that (i) the CRLB is essentially
determined by the ratio $d/r$ and (ii) the range of values of
the CRLB is invariant no matter how large $r$ is. That is, if
the ratio $P_T/P_c$ is raised, distant nodes will tend to
become connected so that estimates for long distances will be
available, but the CRLB will not exceed the range of values of
the CRLB associated with the original small value of
$P_T/P_c$. Consequently, estimates for long distances will
generally suffer smaller relative errors (i.e., the ratio of
the estimation error to $d$) than those for short distances.
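
The scaling property can be checked numerically in the special case $\sigma_{dB} = 0$, the only case with closed-form $f(d)$ and $f'(d)$. The sketch below assumes the unit disk overlap area for $f$ and $S = \pi r^2$, with $\lambda$ held fixed; the function name is hypothetical.

```python
import math


def crlb_unit_disk(d, lam, r=1.0):
    """CRLB of (23) under the unit disk model (sigma_dB = 0)."""
    S = math.pi * r * r
    f = (2.0 * r * r * math.acos(d / (2.0 * r))
         - 0.5 * d * math.sqrt(4.0 * r * r - d * d))
    df_sq = 4.0 * r * r - d * d          # (f'(d))^2
    return (S - f) * (2.0 * S - f) * f / (2.0 * lam * S * S * df_sq)


lam, r0 = 2.0, 3.0
bound_far = crlb_unit_disk(1.5, lam, r=r0)     # distance d with r = r0
bound_near = crlb_unit_disk(1.5 / r0, lam)     # distance d/r0 with r = 1
print(bound_far, bound_near)
# Same bound at a longer distance means a smaller relative error bound.
print(math.sqrt(bound_far) / 1.5, math.sqrt(bound_near) / (1.5 / r0))
```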

Notice that because the CRLB is not monotonic in $d$,
tuning $P_T/P_c$ does not necessarily increase or decrease
the CRLB associated with one given value of $d$. However,
raising $P_T/P_c$ results in more immediate neighbors for
each node and consequently more distance estimates; although
an individual distance estimate is not necessarily improved,
the additional distance estimates will benefit other
applications, for example, sensor network localization.

Furthermore, this feature can be exploited in the
implementation of the proposed method. Because in static
wireless sensor networks the procedure of estimating
distances is usually executed only once, probably at the
beginning of the network lifetime, the ratio $P_T/P_c$ can
initially be set to a high value to achieve a high 'sensor
density' by increasing $P_T$ and/or decreasing $P_c$ and then
be tuned to a normal value after the phase of estimating
distances. As a result, more estimates of long distances with
comparatively good accuracies will be available.


6.  IMPLEMENTING  THE  METHOD 
IN  PRACTICE 

In  this  section,  we  improve  the  proposed  method  when 
dealing  with  short  distances  and  then  test  it  in  a  practical 
environment. 


6.1.  Dealing  with  short  distances 


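Given $\alpha$ and $\sigma_{dB}$, define $\epsilon_{\alpha,\sigma_{dB}}$ to be
the RMSE when $d = 0$. As illustrated in Figure 5, the RMSE of
distance estimates produced by the proposed method
experiences small variations as $d$ increases from 0 up
to $d_{th}$, so that if $d > \epsilon_{\alpha,\sigma_{dB}}$ the RMSE tends
to be under $d$, implying that relatively good performance is
attained. Moreover, on the grounds of the analysis in
Section 4.3, we focus on the function defined by (18) with
$\hat{\rho} \in [\rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}}), 1]$ and
replace it by the linear function
$[(1 - \hat{\rho})\,\epsilon_{\alpha,\sigma_{dB}}] / [1 - \rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}})]$,
which smoothly maps any $\hat{\rho}$ between
$\rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}})$ and 1 to a distance
estimate between 0 and $\epsilon_{\alpha,\sigma_{dB}}$. Consequently,
(18) is updated to be

\[
\hat{d} =
\begin{cases}
0, & \text{if } M = P = Q = 0; \\
\dfrac{(1 - \hat{\rho})\,\epsilon_{\alpha,\sigma_{dB}}}{1 - \rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}})},
  & \text{if } \hat{\rho} > \rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}}); \\
\rho^{-1}_{\alpha,\sigma_{dB}}(\hat{\rho}),
  & \text{if } \rho_{\alpha,\sigma_{dB}}(d_{th}) \le \hat{\rho} \le \rho_{\alpha,\sigma_{dB}}(\epsilon_{\alpha,\sigma_{dB}}); \\
d_{th}, & \text{if } \hat{\rho} < \rho_{\alpha,\sigma_{dB}}(d_{th})
\end{cases}
\tag{24}
\]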
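
The adjusted estimator in (24) can be sketched as follows. For a self-contained example, the function $\rho(d) = f(d)/S$ is taken from the unit disk case ($\sigma_{dB} = 0$, $r = 1$, $d_{th} = 1$) and inverted by bisection, with $\epsilon = 0.5$; these choices and all names are assumptions made for illustration, not the channel used in the paper's experiment.

```python
import math


def rho_unit_disk(d):
    """rho(d) = f(d)/S for the unit disk model with r = 1."""
    f = 2.0 * math.acos(d / 2.0) - 0.5 * d * math.sqrt(4.0 - d * d)
    return f / math.pi


def rho_inverse(rho_hat, lo, hi):
    """Bisection inverse of the decreasing function rho on [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if rho_unit_disk(mid) > rho_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def adjusted_estimate(m, p, q, eps=0.5, d_th=1.0):
    """Distance estimate following the four cases of (24)."""
    total = 2 * m + p + q
    if total == 0:
        return 0.0
    rho_hat = 2.0 * m / total
    rho_eps = rho_unit_disk(eps)
    if rho_hat > rho_eps:
        # Linear segment: rho_hat = 1 maps to 0, rho_hat = rho(eps) to eps.
        return (1.0 - rho_hat) * eps / (1.0 - rho_eps)
    if rho_hat < rho_unit_disk(d_th):
        return d_th
    return rho_inverse(rho_hat, eps, d_th)


for counts in ((10, 1, 1), (10, 3, 3), (10, 10, 10), (0, 2, 3)):
    print(counts, round(adjusted_estimate(*counts), 3))
```

The linear segment meets the inverse function at $\hat{\rho} = \rho(\epsilon)$, so the estimator stays continuous there.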


6.2.  Test  the  method  in  a  real  environment 


In [30], a sensor network consisting of 44 nodes was
deployed in a real environment, and RSS measurements
between any two nodes were reported. On the basis of these
measurement data, we can simulate a realistic environment
in which to implement our method. According to [30],
$\alpha = 2.3$, $\sigma_{dB} = 3.92$, and $P_R(R_0) = -37.47$ dBm. To
proceed with the experiment, we also need to specify the
threshold power $P_c$, which essentially defines whether two
nodes are 'connected', and $\epsilon_{\alpha,\sigma_{dB}}$ in (24). After
that, we can compute the function $g(d)$ associated with this
channel and then obtain the distance estimators on the basis
of (18) and (24), respectively. To avoid boundary effects as
much as possible, we consider the four nodes near the center
of the deployment region, that is, nodes 15, 23, 24, and 25,
and only estimate the inter-node distances associated with
these four nodes by using the originally proposed method and
the method with the adjustment.

In this experiment, by letting $\epsilon_{\alpha,\sigma_{dB}}$ be $0.5r$
and raising $P_c$ from -61 to -52 dBm, we obtain the average
distance estimation errors incurred by the original and
adjusted methods, which are listed in Table II. From the
distance estimates produced by the RSS method (which were
also provided by [30]), we compute the corresponding average
distance estimation error, that is, 1.07 m. As shown in the
table, (i) the adjusted method always outperforms the original
method; (ii) the original method outperforms the RSS method
only in the case of $P_c = -52$ dBm, whereas the adjusted
method outperforms the RSS method if $P_c$ is between -56 and
-52 dBm; and (iii) although the average node degree increases
as $P_c$ decreases, the average error obtained with the
adjusted method increases in general, a phenomenon which is
attributable to boundary effects.

Table II. The experimental results with respect to different $P_c$.

Pc (dBm)    r (m)     ND       DEE (m)    DEE adjusted (m)
-52         4.28      8.50     1.00       0.44
-53         4.73      11.25    1.49       0.56
-54         5.23      14.00    1.95       0.95
-55         5.78      17.25    1.78       0.74
-56         6.38      20.75    1.81       0.70
-57         7.05      27.25    2.00       1.08
-58         7.80      31.75    2.08       1.41
-59         8.62      34.50    2.11       1.57
-60         9.52      37.00    2.15       1.72
-61         10.53     38.75    2.14       1.67

ND, average node degree; DEE, average distance estimation error.
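
The $r$ column of Table II can be approximated from the channel parameters quoted from [30]. The sketch below assumes that $r$ is the distance at which the mean received power of a pure path-loss model equals $P_c$, and that the reported reference power of -37.47 dBm applies at a reference distance of 1 m; both are assumptions, and the function name is hypothetical. Under these assumptions the computed values agree with the table to within a few hundredths of a metre.

```python
def transmission_range(pc_dbm, p_ref_dbm=-37.47, alpha=2.3, ref_dist_m=1.0):
    """Distance (m) at which the mean received power equals pc_dbm.

    Assumes mean power p_ref_dbm at ref_dist_m and a path-loss exponent
    alpha, with no shadowing term.
    """
    return ref_dist_m * 10.0 ** ((p_ref_dbm - pc_dbm) / (10.0 * alpha))


for pc in range(-52, -62, -1):
    print(pc, round(transmission_range(pc), 2))
```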

7.  APPLYING  THE  METHOD 
IN  LOCALIZATION 

In  this  section,  we  report  the  improvement  in  connectivity- 
based  sensor  localization  by  using  the  proposed  method. 

7.1.  Connectivity-based 
localization  algorithms 

Connectivity-based localization algorithms, for example,
[6,8], generally involve as a crucial component a mechanism
for converting connectivity information into rough
distance estimates, which are then used for localization as
in distance-based localization algorithms.

The DV-hop scheme proposed in [6] employs distance
vector exchange. Both sensors and anchors exchange distance
tables, which contain the locations of and the hop counts
to anchors, with their neighboring nodes. Once an anchor
obtains the distance tables from other anchors, it estimates
an individual average distance per hop and broadcasts this
average distance into the network. A sensor approximates its
geographic distance to an anchor by multiplying the hop count
to this anchor by the associated average distance per hop and
then estimates its location by performing trilateration if a
sufficient number of distance estimates are obtained. Its
variant, that is, the DV-distance scheme, is almost the same
as the DV-hop scheme except that it employs the geographic
distances measured with the use of radio signals rather than
hop counts.
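
The distance estimation stage of DV-hop described above can be sketched as follows. The sketch is a simplified, centralized illustration rather than the distributed protocol of [6]: hop counts are found by breadth-first search, each anchor's average distance per hop is computed from its hop counts and distances to the other anchors, and a sensor's distance to an anchor uses that anchor's own hop size (one possible reading of 'associated'). At least two mutually reachable anchors are assumed, and all names are hypothetical.

```python
import math
from collections import deque


def hop_counts(adj, source):
    """Breadth-first hop counts from source over an adjacency dict."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def dv_hop_distances(adj, anchors):
    """Estimate sensor-to-anchor distances; anchors maps id -> (x, y)."""
    hops = {a: hop_counts(adj, a) for a in anchors}
    hop_size = {}
    for a, pos in anchors.items():
        others = [b for b in anchors if b != a and b in hops[a]]
        total_dist = sum(math.dist(pos, anchors[b]) for b in others)
        total_hops = sum(hops[a][b] for b in others)
        hop_size[a] = total_dist / total_hops
    estimates = {}
    for node in adj:
        if node not in anchors:
            estimates[node] = {a: hops[a][node] * hop_size[a]
                               for a in anchors if node in hops[a]}
    return estimates


# Five nodes on a line, one unit apart, with anchors at both ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
anchors = {0: (0.0, 0.0), 4: (4.0, 0.0)}
print(dv_hop_distances(adj, anchors))
```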

Multi-dimensional scaling map (MDS-MAP) proposed
in [8] approximates the distance between two nodes by
the length of their shortest path and then uses
multidimensional scaling to generate a relative map that
represents the relative positions of nodes. Once a sufficient
number of anchors are known, MDS-MAP estimates the absolute
coordinates of all the sensors in the network. Like
DV-distance, MDS-MAP can also employ geographic distance
measurements; we term this variant MDS-MAP distance.

In both DV-hop and MDS-MAP, the distance between
two nodes is roughly estimated according to the length
of the shortest path between them; that is, the one-hop
distances along any shortest path are assumed to be
equal. In contrast to this assumption, our method provides
comparatively accurate estimates of one-hop distances and
thus helps to improve the quality of connectivity-based
sensor localization, which will be demonstrated in the
following subsection.
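
The difference between the two ways of measuring a shortest path can be illustrated with a short sketch. It computes shortest path lengths over a graph whose edge weights are one-hop distance estimates (as DV-distance and MDS-MAP distance would use), which can then be compared with a hop count multiplied by a single hop size. Dijkstra's algorithm is used here as a standard choice, not one prescribed by the paper, and the names and numbers are hypothetical.

```python
import heapq


def shortest_path_lengths(weights, source):
    """Dijkstra over weights[u][v] = estimated one-hop distance."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in weights[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# A three-node chain with unequal one-hop distance estimates.
weights = {"A": {"B": 0.3}, "B": {"A": 0.3, "C": 0.9}, "C": {"B": 0.9}}
by_estimates = shortest_path_lengths(weights, "A")
hop_size = 0.6                            # one hop size for every link
by_hops = {"B": 1 * hop_size, "C": 2 * hop_size}
print(by_estimates, by_hops)
```

Both approaches agree on the end-to-end length here, but only the weighted version distinguishes the short first hop from the long second hop.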

Many methods have been developed in the literature to
improve DV-hop. For instance, in [31], estimating the
distance from a sensor to an anchor not only uses the length
of the shortest path between them as in DV-hop but also
exploits the lengths of the shortest paths from this sensor's
immediate neighbors to the anchor. Moreover, DV-hop
suffers large errors in anisotropic networks, because
the estimates of distances from sensors to anchors can
be extraordinarily inaccurate. Accordingly, the methods in
[32-34] were developed to alleviate the impact of an
anisotropic network topology on the estimates of distances.
Because the comparatively accurate estimates of one-hop
distances provided by our method are the basis for estimating
distances from sensors to anchors, it is attractive to combine
our method with these DV-hop related methods to improve the
estimates of distances from sensors to anchors and thus to
improve localization accuracy.

7.2.  Simulations 

We  simulate  connectivity-based  sensor  localization  under 
the  log-normal  model  using  DV-hop  and  MDS-MAP  and 
their  distance-based  counterparts  DV-distance  and  MDS- 
MAP  distance  (with  distance  measurements  from  the 
adjusted  method). 

To avoid boundary effects, we generate wireless sensor
networks over a large square with side 18 but only localize
the nodes inside a small square with side 6 that is
concentric with the large one. However, the nodes outside
the small square are sometimes used in estimating distances
between nodes inside. Four nodes inside the small square and
closest to its four corners are chosen as anchors. Regarding
the constants parameterizing the log-normal model, $\alpha$ is
known to be 4, $\sigma_{dB}$ takes values from 0, 4, 8, and 12,
and $P_T$ and $P_c$ are assigned such that $r = 1$. Furthermore,
$\lambda$ takes values such that $\mu$ varies from 10 to 40 with
step size 5 (see Table I).

For each choice of $\sigma_{dB}$ and $\mu$, 100 independent runs
are carried out. In each run, first, a static wireless sensor
network is generated according to a homogeneous Poisson
process of density $\lambda$; second, the distance between any
pair of neighboring nodes is estimated on the basis of
Theorem 1, and accordingly, the average absolute distance
estimation error is computed; third, sensors are localized by
using four localization algorithms: DV-hop, MDS-MAP, and
their distance-based counterparts DV-distance and MDS-MAP
distance (with distances coming from the second step); and
finally, the average absolute distance estimation error and
the average position estimation error are computed for each
localization algorithm. Then, the average absolute distance
estimation errors and the average position estimation errors
are averaged over the 100 independent runs and plotted in
Figures 6 and 7.

Figure 6. Average absolute distance estimation errors with $\alpha = 4$ and $r = 1$.

Figure 7. Average position estimation errors with $\alpha = 4$ and $r = 1$.

DV-hop and DV-distance are almost the same except
that DV-hop assumes identical one-hop distances along any
shortest path, whereas DV-distance uses distance estimates
with comparatively good accuracies; this is also true for
MDS-MAP and MDS-MAP distance. Because only connectivity
information with the assistance of four anchors is
exploited to realize sensor localization in the simulations,
the fact that DV-distance and MDS-MAP distance use distance
estimates produced by our method, together with their
superior performance, implies the advantages of our method.

8. CONCLUSIONS

In this paper, we proposed a method of estimating distances
via connectivity in static wireless sensor networks that
handles a generic channel model, including the realistic
log-normal model. The proposed method does not rely on extra
hardware and is fully distributed and energy efficient owing
to its simple mechanism and computations. Under the
log-normal model, the bias and standard deviation of distance
estimates from the proposed method were numerically evaluated
and verified by simulations; the proposed method outperforms
the RSS method for long distances; a CRLB analysis was
carried out for the problem of estimating distances using
connectivity, and useful guidelines for implementing the
proposed method were derived; the influences of uncertainties
in the log-normal model were examined; and finally, extensive
simulations confirmed the advantages of the proposed method.
In future work, we may tackle estimating distances involving
nodes near the network boundaries, implement the method
in a realistic environment, and combine the method with the
DV-hop related methods to make further improvements.


APPENDIX A: PROOF OF THEOREM 1

Establish a statistical model: observations of $M$, $P$ and $Q$ provide
measured data $\varphi = [m\ p\ q]$, where $m$, $p$, $q$ are
non-negative integers; the unknown parameters are
$\theta = [d\ \lambda]$. By formulating the likelihood function, we can
derive that the MLE for $d$ is the solution for $d$ in the following
equation set:

\[ \frac{m}{f(d)} - \frac{p+q}{S - f(d)} + \lambda = 0, \tag{25} \]

\[ \frac{m+p+q}{\lambda} - \bigl(2S - f(d)\bigr) = 0. \tag{26} \]

By eliminating $\lambda$, we can obtain

\[ 2mS = (2m + p + q)\, f(d). \tag{27} \]

If $2m + p + q > 0$, $d = f^{-1}\{[2m/(2m + p + q)]S\}$; otherwise, the
solution for $d$ is not well defined, but because $f(d) = S$ maximizes
the likelihood, we have $d = f^{-1}(S)$. Thus, we prove the theorem.


APPENDIX B: PROOF OF THEOREM 2

Let the two nodes be A and B and consider a finite region of area $D$
covering A and B. As pointed out in [13], apart from A and B, the rest
of the Poisson process is not affected, namely the number of remaining
nodes in this region, denoted $N$, is still Poisson with mean
$\lambda D$. Choosing an arbitrary node C from the $N$ nodes, one of
the following cases holds: (i) C can directly communicate with A and B;
(ii) C can directly communicate with A but not B; (iii) C can directly
communicate with B but not A; and (iv) C cannot directly communicate
with A or B.

Conduct a trial for each of the $N$ nodes to decide how it communicates
with A and B; each trial results in one of the above four cases with
probabilities $p_1$, $p_2$, $p_3$ and $p_4$, respectively. Because of
the independence of connectivity assumed in Assumption 1, the $N$
trials are then independent from each other. Evidently, $M$, $P$ and
$Q$ represent the numbers of nodes belonging to the first three cases.
In addition, let the random variable $L$ denote the number of nodes
belonging to the last case. Because $p_1 + p_2 + p_3 + p_4 = 1$, $M$,
$P$, $Q$ and $L$ follow a multinomial distribution with parameters $N$
and $p_1$, $p_2$, $p_3$ and $p_4$. Considering $N$ is Poisson with mean
$\lambda D$, from the theorem on page 8 in [13], it follows that $M$,
$P$, $Q$ and $L$ are mutually independent Poisson random variables with
means $\lambda D p_1$, $\lambda D p_2$, $\lambda D p_3$ and
$\lambda D p_4$, respectively. Now, we let the region approach the
infinite plane and can conclude that $M$, $P$ and $Q$ are mutually
independent Poisson with finite means.


APPENDIX C: THE EXISTENCE OF $f'(d)$

At first, consider the following expression:

\[
\lim_{\varepsilon \to 0} -\frac{1}{\varepsilon}
\Bigl( g\bigl(\sqrt{x^2 + d^2 - 2xd\cos\theta}\bigr)
 - g\bigl(\sqrt{x^2 + (d+\varepsilon)^2 - 2x(d+\varepsilon)\cos\theta}\bigr) \Bigr)
= \frac{k\,(x\cos\theta - d)\,
 \exp\!\Bigl(-\frac{\bigl(k(\log\sqrt{x^2 + d^2 - 2xd\cos\theta} - \log r)\bigr)^2}{2\sigma_{dB}^2}\Bigr)}
 {\sqrt{2\pi}\,\sigma_{dB}\,(x^2 + d^2 - 2xd\cos\theta)}
\]

which is bounded for $x \in [0, +\infty)$. Moreover, the derivative of
$f(d)$ can be formulated as

\[
f'(d) = \int_0^{\infty}\!\!\int_0^{2\pi} g(x)\, x
\lim_{\varepsilon \to 0} -\frac{1}{\varepsilon}
\Bigl( g\bigl(\sqrt{x^2 + d^2 - 2xd\cos\theta}\bigr)
 - g\bigl(\sqrt{x^2 + (d+\varepsilon)^2 - 2x(d+\varepsilon)\cos\theta}\bigr) \Bigr)
\, d\theta\, dx \tag{28}
\]

Because $\int_0^{\infty}\!\int_0^{2\pi} g(x)\, x\, d\theta\, dx$ equals
$E(M + P)$ and is convergent, $f'(d)$ is also convergent.


APPENDIX D: PROOF OF THEOREM 3

Regarding $r$ as a variable, we substitute the notations as follows:
$S \to S(r)$, $f(d) \to f(r, d)$, $f'(d) \to \partial f(r, d)/\partial d$,
$\mathrm{CRLB}(d) \to \mathrm{CRLB}(r, d)$, $g(d) \to g(r, d)$.
According to (14), we have

\[
g(r, dr) = \int_{k \log d}^{\infty}
\frac{1}{\sqrt{2\pi}\,\sigma_{dB}}\, e^{-\frac{z^2}{2\sigma_{dB}^2}}\, dz = g(1, d).
\]

By (13), we have

\[
f(r, dr) = \int_0^{\infty}\!\!\int_0^{2\pi} g(r, x)\,
g\bigl(r, \sqrt{x^2 + (dr)^2 - 2x\,dr\cos\theta}\bigr)\, x\, d\theta\, dx
= r^2 f(1, d).
\]

Moreover, we can obtain

\[
\left.\frac{\partial f(x, y)}{\partial y}\right|_{x=r,\,y=dr}
= r \left.\frac{\partial f(x, y)}{\partial y}\right|_{x=1,\,y=d}.
\]

By $S(r) = r^2 S(1)$ (based on (15)), (23) and the above formulas, we
can obtain

\[
\mathrm{CRLB}(r, dr) = \mathrm{CRLB}(1, d), \quad \text{equivalently,} \quad
\mathrm{CRLB}(r, d) = \mathrm{CRLB}\!\left(1, \frac{d}{r}\right). \tag{30}
\]
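
As a numerical illustration of the estimator defined by (27), the following is a minimal sketch only. It assumes the unit disk model (chosen here purely for illustration, not the log-normal model analysed in the paper), for which $S = \pi r^2$ and $f(d)$ is the overlap area of two discs of radius $r$. The function names `overlap_area` and `mle_distance` are hypothetical.

```python
import math

def overlap_area(d, r=1.0):
    """f(d) under the assumed unit disk model: area shared by two discs
    of radius r whose centres are d apart (zero once d >= 2r)."""
    if d >= 2 * r:
        return 0.0
    return 2 * r * r * math.acos(d / (2 * r)) - 0.5 * d * math.sqrt(4 * r * r - d * d)

def mle_distance(m, p, q, r=1.0, tol=1e-12):
    """Solve 2 m S = (2 m + p + q) f(d) for d by bisection on [0, 2r]."""
    S = math.pi * r * r                  # S = f(0) for the unit disk model
    total = 2 * m + p + q
    # With no observed neighbours the estimate falls back to f(d) = S.
    target = S if total == 0 else 2 * m * S / total
    lo, hi = 0.0, 2 * r                  # f decreases from S to 0 on this interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if overlap_area(mid, r) > target:
            lo = mid                     # overlap still too large: move farther apart
        else:
            hi = mid
    return 0.5 * (lo + hi)

# m common neighbours, p seen only by A, q seen only by B
d_hat = mle_distance(m=6, p=4, q=5)
```

Bisection is used because, under this assumed model, $f$ is monotonically decreasing on $[0, 2r]$, so $f^{-1}$ needs no closed form.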
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A  Generic  Bias- Correction  Method  with  Application  to  Scan-Based 

Localization 


Yiming Ji, Changbin Yu, Brian D.O. Anderson and Samuel P. Drake


Abstract — In  previous  work  a  method  was  proposed  to 
determine  the  bias  in  localization  algorithms  using  range  or 
bearing  data.  In  this  paper  the  method  is  extended  to  be  more 
generic;  in  particular,  different  types  of  measurement  data  are 
permitted,  and  there  may  be  more  measurements  than  there  are 
variables  to  estimate.  The  method  combines  the  Taylor  series 
and  Jacobian  matrices  to  determine  the  bias,  and  leads  to  an 
easily  calculated  analytical  bias  expression,  despite  the  general 
unavailability  of  analytic  expressions  for  the  solution  of  most 
localization  problems.  The  method  is  used  to  estimate  the  bias  in 
scan-based  localization.  Monte  Carlo  simulation  results  verify 
the  performance  of  the  proposed  method  in  this  context. 

Keywords: Bias; Taylor series; Scan-based localization; Geolocation;
Passive Localization; Targeting; Tracking

I.  Introduction 

Bias is a term in estimation theory, defined as the difference between
the expected value of a parameter estimate and its true value [1]. If
the bias of a particular estimation scheme can be estimated then it can
be removed. Of course, in a particular single instance of the
estimation problem, the correction may worsen the quality of the
estimate; but over a number of experiments, it may be expected to
improve the quality of the estimate, by lowering the mean square error.

A common class of estimation problems is localization, i.e.
determination of the position of a target, e.g. a sound emitter or an
electromagnetic wave emitter, using some form of measurements, such as
range, time difference of arrival, bearing, etc. In almost all
practical localization situations, measurement errors are inevitable,
and these lead to errors in estimating the true target position. In
[2], Picard et al. discussed several models for estimating the bias in
the range measurements, and presented a set of iterative algorithms
that minimize the bias and provide maximum likelihood position
estimates. In [3], Gavish et al. presented analytical expressions of
bias which permit performance comparison for two well known
bearing-only location techniques, viz. the maximum likelihood and the
Stansfield estimators. Further, in [4], Drake et al. presented an
introduction to tensor algebra with some application examples in
estimation theory. One of the tensor algebra applications proposed in
the paper treats the bias in estimating a nonlinear function of a
variable of which one has an observation contaminated by zero mean
additive noise. The method involves expanding the nonlinear function
around the noiseless observation value using a Taylor series which is
truncated at the second order. The expected value of the first order
term is zero and the expected value of the second order term introduces
a bias. As an example, the localization of a target is considered using
noisy range and bearing information from a monostatic radar; with
independent Gaussian zero mean noises contaminating each measurement,
there is a systematic bias in the target estimate, such that the
estimated position on average is closer to the radar than the true
target position.

In our previous work [5, 6, 7], we proposed a method to determine and
correct the bias in localization algorithms. We hypothesized the
existence of a localization mapping g (which maps the vector of
measurements to a position vector estimate). Following the lead of [4],
we expanded this mapping in a Taylor series around the nominal
noiseless measurements that in principle are associated with the
correct target position. The expansion is taken to second order in the
measurement noise, as the first order term has zero mean and the
expected value of the second order term is expressible in terms of the
derivatives of g. However, in a localization problem it may actually be
very hard to calculate the derivatives of g analytically. In contrast,
the inverse mapping of g (call it f), which maps the target position to
a (noiseless) sensor measurement, can often be obtained analytically,
together with its derivatives. Therefore, we introduce the Jacobian
matrix of f to compute the derivatives of the localization mapping g in
terms of the derivatives of f, resulting in a simple calculation of
bias. In comparison with the approach presented in [3], the Monte Carlo
simulation results demonstrate a clearly better performance for our
bias correction method [7].

In our previous work, however, the analysis of the proposed method was
restricted to localization problems using only range or bearing
measurements. In this paper we present the bias correction method in a
more generic way, allowing an arbitrary number of noisy measurements
which are not restricted to being either range or bearing. To
demonstrate the performance of the proposed method, the generic bias
correction method is applied by way of example to scan-based
localization algorithms, which have not been analyzed in our previous
work. In the process of applying the proposed method to scan-based
localization, the original bias correction method needs to be adjusted
in a minor way to allow for certain correlations in the measurement
errors.

The  rest  of  this  paper  is  organized  as  follows.  In  Section 
II  the  background  and  motivation  of  our  work  are  presented. 
The  generic  bias  correction  method  is  proposed  in  Section 
III.  In  Section  IV,  we  apply  the  bias  correction  method  to 
the  scan-based  localization  problem.  Monte  Carlo  simulation 
results  are  also  provided  in  Section  IV.  Section  V  summarizes 
the  main  results  of  the  paper. 

II.  Background  and  Motivation 
A.  Background 

A brief review of estimation bias is presented in this section.

Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ and
$\Theta = (\theta_1, \theta_2, \ldots, \theta_N)^T$ denote the target
position and noiseless measurement vector respectively. Let
$\mathbf{f}: \mathbf{x} \to \Theta$ denote the associated mapping,
which is almost always analytically computable. Let
$\mathbf{g}: \Theta \to \mathbf{x}$ denote the inverse localization
mapping; thus with $\Theta = \mathbf{f}(\mathbf{x})$, there holds
$\mathbf{x} = \mathbf{g}(\Theta)$. In order that this mapping can be
well defined, it is necessary that $N \geq n$. In case $N > n$, usually
the set of equations $\mathbf{f}(\mathbf{x}) = \Theta$ yielding
$\mathbf{x}$ from $\Theta$ will be overdetermined, while in case
$N = n$, there may be two or more solutions (but for generic geometries
only a finite number); in this case, the selection of the correct
solution requires some additional information regarding the target
position.

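In practice, noise in the measurements is inevitable. We denote this
noise by $\delta\Theta = (\delta\theta_1, \delta\theta_2, \ldots, \delta\theta_N)^T$.
The $\delta\theta_i$ are generally assumed to be independent Gaussian
random variables with zero mean and known variances $\sigma_i^2$, which
may be the same. The error in the target position resulting from an
estimation procedure using noisy data is denoted as $\delta\mathbf{x}$.
If we define $\tilde\Theta = \Theta + \delta\Theta$ and
$\tilde{\mathbf{x}} = \mathbf{x} + \delta\mathbf{x}$, the localization
amounts to solving $\mathbf{f}(\tilde{\mathbf{x}}) = \tilde\Theta$ for
$\tilde{\mathbf{x}}$. If the function g is known, then in effect we are
implementing the following equation

\[ \tilde{\mathbf{x}} = \mathbf{x} + \delta\mathbf{x} = \mathbf{g}(\Theta + \delta\Theta) = \mathbf{g}(\tilde\Theta) \tag{1} \]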

If g is a nonlinear function then this process will lead to bias in
the target position estimate [8]. Suppose, by way of a thought
experiment, that in estimating the value of $\mathbf{x}$ the
measurement process of equation (1) was repeated $M$ times. As
$M \to \infty$, the average of the estimates would go to:

\[ E[\tilde x_i] = E[g_i(\tilde\Theta)] \tag{2} \]

Now note that if $g_i$ is nonlinear we would in general have:

\[ E[\tilde x_i] = E[g_i(\tilde\Theta)] \neq g_i(E[\tilde\Theta]) = g_i(\Theta) = x_i \]

The bias in our estimate of $x_i$ is defined as the difference between
the expected value of $\tilde x_i$ and the true value of $x_i$, i.e.

\[ \mathrm{Bias}_{x_i} = E[\tilde x_i] - x_i = E[g_i(\tilde\Theta)] - x_i \]

If computable, the bias can be used to systematically correct any
single estimate from any single measurement set.
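
The thought experiment behind equation (2) can be mimicked numerically. The sketch below is an illustration only: it takes the range and bearing example mentioned in the Introduction, with g the polar-to-Cartesian conversion, and averages many noisy estimates. The target position, the noise levels and the function name `monte_carlo_bias` are assumptions made for the example.

```python
import math
import random

def monte_carlo_bias(x, y, sigma_range, sigma_bearing, runs=400_000, seed=0):
    """Average of g(noisy range, noisy bearing) minus the true position."""
    rng = random.Random(seed)
    true_range = math.hypot(x, y)
    true_bearing = math.atan2(y, x)
    sum_x = sum_y = 0.0
    for _ in range(runs):
        # Independent zero-mean Gaussian errors on each measurement.
        r = true_range + rng.gauss(0.0, sigma_range)
        b = true_bearing + rng.gauss(0.0, sigma_bearing)
        # g is nonlinear in the bearing, so E[g(noisy)] != g(noiseless).
        sum_x += r * math.cos(b)
        sum_y += r * math.sin(b)
    return sum_x / runs - x, sum_y / runs - y

bias_x, bias_y = monte_carlo_bias(3.0, 4.0, sigma_range=0.1, sigma_bearing=0.1)
```

Both components come out negative for this target, i.e. the averaged estimate sits slightly closer to the sensor at the origin than the true position.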

B.  Motivation 

From the above analysis, we can see that if the estimation mapping g
is nonlinear and the sensor measurements are noisy, bias is present. In
practice, these two conditions are present in most localization
scenarios. Since bias is a systematic and possibly computable error, it
is desirable to remove it.

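In [4], Drake et al. gave a short introduction to tensor algebra and
provided a few sample applications. One such application was concerned
with bias arising in nonlinear estimation problems with noisy
measurements. To determine the bias they considered, as we do,
$\tilde x_i = g_i(\tilde\Theta)$. Assuming the estimator mapping g is
well defined, they expanded the function $g_i$ by a Taylor series and
truncated it at second order:

\[
\begin{aligned}
x_i + \delta x_i &= g_i(\tilde\theta_1, \tilde\theta_2, \ldots, \tilde\theta_N) \\
 &= g_i(\theta_1 + \delta\theta_1, \theta_2 + \delta\theta_2, \ldots, \theta_N + \delta\theta_N) \\
 &\approx g_i(\theta_1, \theta_2, \ldots, \theta_N)
 + \sum_{j=1}^{N} \delta\theta_j \frac{\partial g_i}{\partial \theta_j}
 + \frac{1}{2} \sum_{j=1}^{N} \sum_{l=1}^{N} \delta\theta_j\, \delta\theta_l
 \frac{\partial^2 g_i}{\partial \theta_j\, \partial \theta_l}
\end{aligned}
\]

Taking expectations, with the $\delta\theta_j$ independent, zero mean
and of variance $\sigma_j^2$, the first order term vanishes and only
the diagonal second order terms survive:

\[ E(\delta x_i) \approx \frac{1}{2} \sum_{j=1}^{N} \sigma_j^2\, \frac{\partial^2 g_i}{\partial \theta_j^2} \tag{3} \]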


For some estimators, the mapping g can be obtained analytically.
However, in some situations, including many localization problems,
finding g analytically is very hard or even impossible. If g cannot be
obtained then equation (3) cannot be evaluated directly, and we need a
new method to obtain the derivatives analytically.

The key is to notice that g is the inverse of the mapping f, for which
an analytic form is often known. Below we show how to use the mapping f
and its derivatives to calculate the derivatives of g using the
Jacobian identity, ultimately resulting in an estimate of the bias.


III.  Bias  Correction  Method 

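To begin, we first assume $N = n$ (the number of sensor measurements
$N$ is equal to the dimension $n$ of the position coordinates of the
target). We assume further that f is a known analytic function. Because
f and g are inverse mappings, the Jacobian identity holds:

\[
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial f_N}{\partial x_1} & \cdots & \frac{\partial f_N}{\partial x_n}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial g_1}{\partial \theta_1} & \cdots & \frac{\partial g_1}{\partial \theta_N} \\
\vdots & & \vdots \\
\frac{\partial g_n}{\partial \theta_1} & \cdots & \frac{\partial g_n}{\partial \theta_N}
\end{bmatrix}
= I \tag{4}
\]
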

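By rearranging the equation set (4) we can obtain analytical
expressions for the $\partial g_i / \partial \theta_j$
($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, N$) in terms of the
$\partial f_j / \partial x_i$, and thus as analytic functions of the
$x_i$. For ease of exposition we use $G_{ij}$ to denote the expressions
of $\partial g_i / \partial \theta_j$ as functions of
$x_1, x_2, \ldots, x_n$. To obtain second derivatives, let us use
$\partial^2 g_1 / \partial \theta_1^2$ as an example to illustrate the
general approach. Starting with

\[ \frac{\partial g_1}{\partial \theta_1} = G_{11}(x_1, x_2, \ldots, x_n) \tag{5} \]

we differentiate with respect to $x_1$ first, and thereby obtain

\[
\frac{\partial^2 g_1}{\partial \theta_1^2} \frac{\partial f_1}{\partial x_1}
+ \frac{\partial^2 g_1}{\partial \theta_1 \partial \theta_2} \frac{\partial f_2}{\partial x_1}
+ \cdots
+ \frac{\partial^2 g_1}{\partial \theta_1 \partial \theta_N} \frac{\partial f_N}{\partial x_1}
= \frac{\partial G_{11}}{\partial x_1}.
\]

Further differentiating equation (5) with respect to
$x_2, \ldots, x_n$, we can obtain an equation set as follows: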


\[
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_N}{\partial x_1} \\
\vdots & & \vdots \\
\frac{\partial f_1}{\partial x_n} & \cdots & \frac{\partial f_N}{\partial x_n}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial^2 g_1}{\partial \theta_1^2} \\
\vdots \\
\frac{\partial^2 g_1}{\partial \theta_1 \partial \theta_N}
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial G_{11}}{\partial x_1} \\
\vdots \\
\frac{\partial G_{11}}{\partial x_n}
\end{bmatrix} \tag{6}
\]

Note that the quantities on the right side of this equation, the
derivatives of $G_{11}$ (the expression for
$\partial g_1 / \partial \theta_1$ obtained from (4)), are all
expressible analytically as functions of $x_1, x_2, \ldots, x_n$.
Likewise the entries $\partial f_i / \partial x_j$ in the matrix on the
left are known functions of $x_1, x_2, \ldots, x_n$. Hence by solving
the equation set (6), we can obtain a formula for
$\partial^2 g_1 / \partial \theta_1^2$ as a function of
$x_1, x_2, \ldots, x_n$. The formulas for
$\partial^2 g_i / \partial \theta_j^2$ for all $i, j$ can be obtained
in the same way. Substituting these formulas into equation (3), which
assumes, as is often the case, that the measurement errors are
independent Gaussian random variables with zero mean and known variance
(the variance of measurement errors would have to be obtained from
manufacturer or experimental data), we can finally obtain easily
calculated expressions for the bias in terms of the $f_i$ and their
derivatives.

Fig. 1. Introduce one extra variable (here N = 3 and n = 2). The
surface is the set of points
$(\theta_1, \theta_2, \theta_3) = (f_1(x, y), f_2(x, y), f_3(x, y))$
obtained as $x$, $y$ vary.


The calculation is used as follows. We first obtain the estimated
position of the target by using an existing localization algorithm.
Then we input the estimated target location into the analytical
expression obtained for the bias. Finally we can improve (on a
statistical basis at least) the accuracy of the localization by
subtracting the obtained bias, viz.
$\tilde x_i - \mathrm{Bias}_{x_i}$ ($i = 1, 2, \ldots, n$). (The
process could in fact be iterated, but the effect of even one more
iteration on the corrected position estimate will almost always be
marginal.)
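
The procedure of this section can be traced on a small case with N = n = 2. The sketch below is an illustration under stated assumptions: f maps a planar position to (range, bearing) from a sensor at the origin, so that g is also known in closed form and the result can be checked. The function names, the test position and the use of central differences for the derivatives of the $\partial g_i/\partial\theta_j$ expressions are choices made for the example; the paper itself works with analytic derivatives.

```python
import math

def jac_f(x, y):
    """Analytic Jacobian of f(x, y) = (range, bearing); entry [k][j] = d f_k / d x_j."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def first_derivs_g(x, y):
    """Jacobian identity (4): entry [i][j] = d g_i / d theta_j as a function of (x, y)."""
    return inv2(jac_f(x, y))

def analytic_bias(x, y, var, h=1e-4):
    """Approximate bias (3), with second derivatives of g solved from (6)."""
    J = jac_f(x, y)
    Jt_inv = inv2([[J[0][0], J[1][0]], [J[0][1], J[1][1]]])  # inverse of J transposed
    gx_p, gx_m = first_derivs_g(x + h, y), first_derivs_g(x - h, y)
    gy_p, gy_m = first_derivs_g(x, y + h), first_derivs_g(x, y - h)
    bias = [0.0, 0.0]
    for i in range(2):
        for j in range(2):
            # Right side of (6): derivatives of d g_i / d theta_j with respect to x and y.
            rhs = [(gx_p[i][j] - gx_m[i][j]) / (2 * h),
                   (gy_p[i][j] - gy_m[i][j]) / (2 * h)]
            # Solve (6): second[k] = d^2 g_i / (d theta_j d theta_k).
            second = [Jt_inv[k][0] * rhs[0] + Jt_inv[k][1] * rhs[1] for k in range(2)]
            bias[i] += 0.5 * var[j] * second[j]
    return bias

# Variances of the range and bearing errors (both 0.1 squared here).
bias = analytic_bias(3.0, 4.0, var=[0.1 ** 2, 0.1 ** 2])
corrected = [3.0 - bias[0], 4.0 - bias[1]]  # estimate minus bias, taking (3, 4) as the estimate
```

For this particular f the closed form is available for comparison: $\partial^2 g/\partial\theta_2^2 = -(x, y)$ and $\partial^2 g/\partial\theta_1^2 = 0$, so the bias is $-\tfrac{1}{2}\sigma_2^2 (x, y)$.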

If the number of measurements $N$ is greater than the number of scalar
position variables $n$, we cannot obtain the equations (4) and (6), and
so we cannot straightforwardly express the bias using the derivatives
of f. Indeed, while the noise-free equation
$\mathbf{f}(\mathbf{x}) = \Theta$ is overdetermined, the noisy equation
$\mathbf{f}(\tilde{\mathbf{x}}) = \tilde\Theta$ in general will have no
solution. The localization problem is typically solved by something
like a least squares approach$^1$, and for the purposes of bias
determination, we build on this approach too, introducing $N - n$ extra
variables into f so that the number of unknowns again equals $N$.

For ease of exposition, here we take $N = n + 1$, which means just one
extra variable needs to be introduced. Consider $N$-dimensional space,
with axes corresponding to the $N$ measurements. Assume an
$(N - 1)$-dimensional hypersurface (illustrated in Figure 1 for the
case $N = 3$) consists of points which correspond to all sets of
noiseless measurements $(\theta_1, \theta_2, \ldots, \theta_N)$, i.e.
$\theta_i = f_i(x_1, x_2, \ldots, x_n)$ for $i = 1, 2, \ldots, N$.
According to the least squares method, we can consider choosing
$x_1, x_2, \ldots, x_n$ to minimize the following cost function:

\[
F_{\text{cost-function}}(\mathbf{x}, \tilde\Theta)
= \sum_{i=1}^{N} \bigl(f_i(\mathbf{x}) - \tilde\theta_i\bigr)^2
= \sum_{i=1}^{N} \bigl(\theta_i - \tilde\theta_i\bigr)^2 \tag{7}
\]

In fact, the least squares method attempts to find a point
$(\theta_1, \theta_2, \ldots, \theta_N)$ (the white point in Figure 1)
on the surface corresponding to an obtained set of noisy measurements
$(\tilde\theta_1, \tilde\theta_2, \ldots, \tilde\theta_N)$ (the black
point in Figure 1, which is generically off the surface) to minimize
the distance between the two points. Hence the white point must be the
orthogonal projection of the black one onto the surface, or
equivalently the black point must lie on the normal to the tangent
plane of the surface at the white one.

1 The least squares approach is equivalent to a maximum likelihood
approach when all noise variances are the same (and measurement noises
are independent zero mean Gaussian random variables). Weighted least
squares can capture variations on this.

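The $n$ tangent vectors at any point on the $n$-dimensional surface are
given by:

\[
\mathbf{v}_i = \left[ \frac{\partial f_1}{\partial x_i},\ \frac{\partial f_2}{\partial x_i},\ \ldots,\ \frac{\partial f_N}{\partial x_i} \right]^T,
\quad i = 1, 2, \ldots, n \tag{8}
\]

The normal vector $\mathbf{u}$ to the surface at the white point is
formed from the cross product of the tangent vectors $\mathbf{v}_i$
[9]:

\[ \mathbf{u} = [u_1, u_2, \ldots, u_N]^T = \mathbf{v}_1 \times \mathbf{v}_2 \times \cdots \times \mathbf{v}_n \tag{9} \]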

Note that in the noiseless case
$\Theta = \mathbf{f}(x_1, x_2, \ldots, x_n)$, where f can be written
down easily according to the geometry of the sensors and target. The
black point can be regarded as an image obtained with a new analytical
mapping $\mathbf{F} = (F_1, F_2, \ldots, F_N)^T : R^N \to R^N$,
corresponding to moving from $\mathbf{f}(\mathbf{x})$, which is a known
function of $x_1, x_2, \ldots, x_n$, along the normal vector for a
distance $\varepsilon \|\mathbf{u}\|$. The new mapping F is a known
analytic function of $x_1, x_2, \ldots, x_n$ and $\varepsilon$:

\[ \tilde\Theta = \mathbf{F}(\mathbf{x}, \varepsilon) = \mathbf{f}(\mathbf{x}) + \varepsilon \mathbf{u} \tag{10} \]

(Of course, $\mathbf{u}$ is an analytic function of
$x_1, x_2, \ldots, x_n$.)

Introduction of the extra variable $\varepsilon$ means that equation
(10) is not an overdetermined equation set, and F is in principle
invertible. Therefore we can consider the new localization mapping
(call it G) as the inverse mapping of F. We can then proceed along the
same lines as previously to determine the bias.

When the number of input measurements $N$ exceeds $n + 1$, the
situation is similar to the case $N = n + 1$. More details can be found
in [6, 7].
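
The construction in (8) to (10) can be made concrete for N = 3 and n = 2, the case drawn in Figure 1. The sketch below is an illustration only: it assumes three range measurements to hypothetical sensor positions `ANCHORS`, and the helper names are invented for the example.

```python
import math

# Hypothetical sensor positions, chosen only for this illustration.
ANCHORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

def f(x, y):
    """Noiseless ranges from the target (x, y) to the three sensors."""
    return [math.hypot(x - ax, y - ay) for ax, ay in ANCHORS]

def tangents(x, y):
    """Tangent vectors (8): v1 = d f / d x and v2 = d f / d y."""
    d = f(x, y)
    v1 = [(x - ax) / di for (ax, _), di in zip(ANCHORS, d)]
    v2 = [(y - ay) / di for (_, ay), di in zip(ANCHORS, d)]
    return v1, v2

def cross(a, b):
    """Ordinary cross product in three dimensions, as in (9) with n = 2."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def F(x, y, eps):
    """Extended mapping (10): a point on the surface moved eps * u along the normal."""
    v1, v2 = tangents(x, y)
    u = cross(v1, v2)
    return [fi + eps * ui for fi, ui in zip(f(x, y), u)]

# A measurement vector lying off the surface, above the target at (3, 4).
theta_noisy = F(3.0, 4.0, 0.2)
```

Because the offset `eps * u` is orthogonal to both tangent vectors, the surface point `f(3.0, 4.0)` satisfies the first-order least squares condition for `theta_noisy`, which is the geometric statement made in the text.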


IV.  Application  to  the  Scan-Based  Localization 
Problem 

In  order  to  demonstrate  the  effectiveness  of  the  proposed 
bias-correction  method,  we  apply  the  proposed  method  to  the 
scan-based  localization  problem,  which  is  studied  in  [10]. 

A.  Review  of  Scan-Based  Localization 

The target is assumed to be an emitter with a (generally mechanically)
rotating radar antenna with a narrow beam; the scan direction and scan
rate are assumed constant and indeed known to each receiver, which
records the time instants at which the rotating beam passes the
receiver [10]. For ease of exposition, here we consider a situation
with one emitter and three receiving sensors, and assume that all lie
in a plane. Figure 2 shows an emitter scanning across three receivers
(i.e., receiver 1, receiver 2 and receiver 3) at the times $t_1$, $t_2$
and $t_3$. The emitter is scanning clockwise. For localization, the
separate time values are not important, but rather their differences,
$t_{12} = t_2 - t_1$ and $t_{23} = t_3 - t_2$, are. In fact, we treat
$t_{12}$, $t_{23}$ as quasi-measurements. Each quasi-measurement
(together with knowledge of the scan rate and direction) in the
noiseless case defines a circle of computable center and radius, and
indeed an arc of such a circle on which the emitter must lie. The pair
of sensors determining the time difference lie at the end points of the
arc. The intersection of two such circular arcs defines the emitter
position; thus there is a vector function
$\mathbf{g} = (g_1, g_2)^T$, the localization mapping, of the two
variables $t_{12}$, $t_{23}$, with g embodying a formula for the
intersection of two circular arcs. Finding an analytic expression for
the mapping is a significant challenge.

Fig. 2. An emitter scanning across three receivers. The red star
indicates the radar's location whereas the filled circles indicate the
locations of the receivers.

The two factors causing bias to arise in an estimate, viz.
nonlinearity of the estimation mapping and noise in the measurements,
are present in any practical scan-based localization problem.
Accordingly, we seek to apply our method, using the inverse of g (viz.
the mapping f from target position to measurements) to obtain a formula
for the bias.

B. Analytic expression for the mapping from emitter position to
quasi-measurements

Now we aim to obtain an analytic form for the function f which maps
the emitter position $(x, y)$ to the quasi-measurements
$(t_{12}, t_{23})$. Let $\alpha_1, \alpha_2$ denote the angles
subtended at the emitter by the lines joining it to the two pairs of
physical sensors, see Figure 2. Since the scan rate $\omega$ is a known
constant, we can obtain the following equations:

\[ \alpha_i = \omega\, t_{i,i+1}, \quad i = 1, 2 \tag{11} \]

Given the three receivers at known locations $p_1 = (x_1, y_1)$,
$p_2 = (x_2, y_2)$ and $p_3 = (x_3, y_3)$, it is straightforward to see
that

\[
\alpha_i = \arccos \frac{d_{0,i}^2 + d_{0,i+1}^2 - d_{i,i+1}^2}{2\, d_{0,i}\, d_{0,i+1}},
\quad i = 1, 2 \tag{12}
\]

where

\[ d_{0,i} = \sqrt{(x - x_i)^2 + (y - y_i)^2}, \quad i = 1, 2, 3 \]
\[ d_{i,i+1} = \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}, \quad i = 1, 2 \]

Substituting equations (12) into (11), we can obtain the following
formulas:

\[
t_{i,i+1} = f_i(x, y) = \frac{1}{\omega} \arccos \frac{d_{0,i}^2 + d_{0,i+1}^2 - d_{i,i+1}^2}{2\, d_{0,i}\, d_{0,i+1}},
\quad i = 1, 2 \tag{13}
\]

These last equations provide an analytic formula for
$\mathbf{f} = (f_1, f_2)$, which is the inverse mapping of the
localization process $\mathbf{g} = (g_1, g_2)$.

When the number of receivers is larger than 3, we can use the
least-squares based method proposed in Section III to introduce extra
variables.

C. Bias-Correction in Scan-Based Localization

As the first step in obtaining an expression for the bias, we Taylor
expand the localization mappings $g_1$ and $g_2$ to the second order
terms, as described in Section III. However, in the scan-based
localization problem, equation (3) requires adjustment. To see why,
note that noise in the time-of-arrival (TOA) measurements can be
modelled as follows:

\[ \tilde t_i = t_i + \delta t_i, \quad i = 1, 2, 3 \tag{14} \]

where the $\delta t_i$ are assumed to be i.i.d. Gaussian random
variables with zero mean and known variance $\sigma^2$.

However, in scan-based localization, the physical measurements are
replaced by quasi-measurements
$\tilde t_{12} = \tilde t_2 - \tilde t_1$ and
$\tilde t_{23} = \tilde t_3 - \tilde t_2$. This leads to

\[ \tilde t_{12} = \tilde t_2 - \tilde t_1 = t_{12} + \delta t_{12} \tag{15} \]

\[ \tilde t_{23} = \tilde t_3 - \tilde t_2 = t_{23} + \delta t_{23} \tag{16} \]

where $\delta t_{12}$ and $\delta t_{23}$ are no longer independent and
have covariance matrix $\Sigma$ given by

\[ \Sigma = 2\sigma^2 \begin{bmatrix} 1 & -0.5 \\ -0.5 & 1 \end{bmatrix} \tag{17} \]

where $2\sigma^2$ is the variance of an individual time-difference
measurement. Note that the means of $\delta t_{12}$ and
$\delta t_{23}$ remain zero.

Now the approximate bias expression for three receivers, obtained by
taking the expectation of the second order Taylor terms with the
covariance (17), is as follows:

\[
E(\delta x) \approx \sigma^2 \left(
\frac{\partial^2 g_1}{\partial t_{12}^2}
- \frac{\partial^2 g_1}{\partial t_{12}\, \partial t_{23}}
+ \frac{\partial^2 g_1}{\partial t_{23}^2} \right)
\]

$E(\delta y)$ can be obtained in the same way. The remaining
calculations are the same as those described in Section III.
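
The mapping (13) and the covariance (17) are both simple to evaluate numerically. The sketch below is an illustration only: the receiver coordinates and the first emitter position are those listed for the three receivers scenario, while the scan rate is an assumption inferred from the 2.5 second scan period quoted in the simulation set-up. The function names are hypothetical.

```python
import math

# Receiver positions (km) from the three receivers scenario.
RECEIVERS = [(-72.74, -149.86), (-74.05, -71.46), (0.0, 0.0)]
# Scan rate in rad/s; assumed here from the stated 2.5 s scan period.
OMEGA = 2 * math.pi / 2.5

def quasi_measurements(x, y, receivers=RECEIVERS, omega=OMEGA):
    """Equation (13): time differences t_{i,i+1} for an emitter at (x, y)."""
    out = []
    for (xa, ya), (xb, yb) in zip(receivers, receivers[1:]):
        d0a = math.hypot(x - xa, y - ya)      # emitter to receiver i
        d0b = math.hypot(x - xb, y - yb)      # emitter to receiver i+1
        dab = math.hypot(xa - xb, ya - yb)    # receiver i to receiver i+1
        cos_alpha = (d0a ** 2 + d0b ** 2 - dab ** 2) / (2 * d0a * d0b)
        # Clamp against rounding before taking the arccosine of (12).
        out.append(math.acos(max(-1.0, min(1.0, cos_alpha))) / omega)
    return out

def quasi_covariance(sigma):
    """Covariance of (dt12, dt23) = D (dt1, dt2, dt3) for i.i.d. TOA errors."""
    D = [[-1, 1, 0],
         [0, -1, 1]]
    return [[sigma ** 2 * sum(a * b for a, b in zip(row_i, row_j)) for row_j in D]
            for row_i in D]

t12, t23 = quasi_measurements(23.32, -49.98)   # emitter 1
cov = quasi_covariance(0.05)                   # equals 2 sigma^2 [[1, -0.5], [-0.5, 1]]
```

Writing the quasi-measurement errors as a linear map D of the three TOA errors makes the off-diagonal term of (17) explicit: the shared error $\delta t_2$ enters the two differences with opposite signs.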

D.  Simulation  Results 

In this subsection, simulation results are shown to demonstrate the
performance of the proposed bias-correction method in scan-based
localization problems. All the simulated data is provided by the
Defence Science and Technology Organisation (DSTO).

The simulations were done using DSTO's synthetic integration lab
(SIL). The SIL accurately simulates existing radar and receiver
systems; it includes accurate sensor, emitter, terrain and propagation
models. The fidelity of the SIL means that there is no significant
difference between its simulation results and those generated by field
tests. Simulation results are provided in 2-dimensional space with two
scenarios: (1) three receivers and one emitter; (2) four receivers and
one emitter. Different emitter positions are considered.

The simulation set-up is as follows:

• The measurement error $\delta t_i$ for each sensor is drawn from an
independent Gaussian distribution with zero mean and known variance
$\sigma^2$. The level of noise (the standard deviation $\sigma$) is set
in the simulations to 0.05, 0.1 or 0.15 seconds for the TOA
measurements $t_i$.

• All the simulation results are obtained from 1000 Monte Carlo
experiments.

• In the simulations the bias is considered as the average absolute
distance (average of 1000 experimental results) between the true
emitter position and the estimated emitter position$^2$. In the
simulation figures it is termed the ‘average absolute distance error’.

2 In practice, the bias is a vector whose entries can be negative or
positive. Here we only focus on how large the bias is. Therefore the
absolute distance between the estimated target position and the true
position is used to evaluate the bias.

Fig. 3. Positions of emitters and receivers: (a) three receivers situation; (b) four receivers situation.

Fig. 4. Three receivers scenario with different noise levels: (a) σ = 0.05; (b) σ = 0.10; (c) σ = 0.15.

Fig. 5. Four receivers scenario with different noise levels: (a) σ = 0.05; (b) σ = 0.10; (c) σ = 0.15.


• ‘Analytical bias’ denotes the bias obtained by using the analytical
expression derived from the proposed bias-correction method.

• ‘Experimental bias’ denotes the bias obtained by using simulation.

• The scan rate is $2\pi/2.5$ radians per second, i.e. one period is
2.5 seconds, and so the standard deviation of TOA measurements is
between 2% and 6% of a period.

• The distance unit used in the simulations is kilometers (km).

• The time unit and the angle unit used in the simulations are seconds
and radians.

1) Three Receivers Scenario

In this situation, three receivers give rise to two quasi-measurements
($t_{12}$ and $t_{23}$). Therefore the ambient space dimension $n$ is
equal to the number of measurements $N$.

Figure 3(a) depicts the three receivers and the four different emitter
positions. The three receivers are located as follows:

• Receiver 1: (-72.74, -149.86)

• Receiver 2: (-74.05, -71.46)

• Receiver 3: (0, 0)

The four different emitter positions are:

• Emitter 1: (23.32, -49.98)

• Emitter 2: (-17.937, -92.719)

• Emitter 3: (-59.096, -135.554)

• Emitter 4: (-25.364, -72.736)

Figure 4(a) compares the bias in two situations: without any
bias-correction method (blue bars) and with the proposed
bias-correction method (red bars). The standard deviation of
the measurement error is 0.05 seconds ($\sigma = 0.05$). The
proposed method reduces the localization bias by typically
between 70% and 80% for each of the four emitter positions.

The effect of raising the noise level to 0.1 seconds
($\sigma = 0.1$) and 0.15 seconds ($\sigma = 0.15$) is depicted
in Figures 4(b) and 4(c) respectively. Although the bias grows
as the noise level increases, the proposed bias-correction
method still performs very well (after reduction, the bias is
no more than 0.4 km). The simulation results demonstrate that
the proposed bias-correction method is robust to the noise level.

2) Four Receivers Scenario

In this situation, there are four receivers, and we obtain
three independent quasi-measurements. We denote these
quasi-measurements as $t_{12}$, $t_{23}$ and $t_{34}$. The ambient
space dimension $n = 2$ is less than the number of
quasi-measurements, which corresponds to the $N > n$ situation
for the proposed bias correction. We therefore use the method
outlined in Section III to introduce an extra variable, which
makes $N = n$ again.

Figure 3(b) shows the location of the four receivers and
the four emitter positions. The four receivers are positioned
as follows:

• Receiver 1: (0, 0)
• Receiver 2: (158.41, -60.21)
• Receiver 3: (158.29, -140.26)
• Receiver 4: (0, -200.24)

There are four different emitter positions:

• Emitter 1: (19.77, -100.12)
• Emitter 2: (79.17, -100.15)
• Emitter 3: (138.58, -100.21)
• Emitter 4: (79.19, -80.08)

Figure 5(a) shows the simulation results in 2-dimensional
space with three quasi-measurements. Again, the figure shows
that the proposed method (red bars) reduces the bias very
effectively (by up to 75%). Furthermore, comparison with the
simulation results for the three-receiver scenario
(Figure 4(a)) shows that introducing an extra sensor,
unsurprisingly, improves the accuracy of the localization.

Figures 5(b) and 5(c) illustrate the performance of the
proposed bias-correction method with different levels of noise
($\sigma = 0.1$ and $\sigma = 0.15$). As in the three-receiver
scenario, the proposed method is effective in reducing bias
even with a high level of noise. Again, compared with the
three-receiver scenario (Figure 4(b) and Figure 4(c)), the
accuracy of the localization is enhanced by introducing one
more sensor.

V.  Conclusions 

In previous work [5, 6, 7], we proposed a method to
determine and thus correct the bias in localization algorithms
using range measurements or bearing measurements. The
bias-correction method combines Taylor series expansions
and Jacobian matrices to express the bias analytically. In
this paper we present the bias-correction method in a more
generic way. To demonstrate its validity, we have applied it
to the scan-based localization problem, in which the original
method needs an adjustment due to the way noise enters the
quasi-measurements. Monte Carlo simulation results based on
the simulated data provided by DSTO Australia's synthetic
integration laboratory (SIL) demonstrated the performance of
the proposed bias-correction method. Our future work is aimed
at improving the performance of the proposed method by using
higher-order terms of the Taylor series; this may be important
in high-noise situations.

Localization of Sensor Networks using Bilateration
and Four-Bar Linkage Mechanisms

S. Alireza Motevallian, Lu Xia, Brian D. O. Anderson


Abstract — The  paper  explores  the  localization  of  large-scale 
sensor  networks  by  splitting  them  into  small  sub-networks, 
localizing these in coordinate bases particular to each subnetwork,
and then merging back the results to localize the whole
network  in  a  common  coordinate  basis.  Primary  focus  is  on  the 
merging  step,  and  we  extend  the  idea  of  bilateration  in  order 
to  merge  pairs  of  sub-networks.  This  involves  the  introduction 
of  a  novel  approach  based  on  the  four-bar  linkage  mechanism. 
We  show  that  the  proposed  technique  is  capable  of  merging 
localizations  of  any  pair  of  sub-networks  whose  connections  are 
sufficient  to  ensure  the  whole  network,  i.e.  subnetworks  plus 
connections,  would  be  in  principle  localizable  using  centralized 
calculations.  The  main  benefit  of  the  algorithm  is  that  it  is 
effectively  distributed,  with  very  low  computational  complexity  in 
comparison  with  many  localization  techniques  (e.g.  MDS-MAP). 
Another  important  outcome  of  this  technique  is  its  applicability  to 
a  broader  class  of  networks  than  those  treatable  with  bilateration 
or  trilateration. 

Index  Terms — Sensor  Network  Localization,  Merging-based 
localization,  Bilateration,  Four-bar  linkage  mechanism 

I.  Introduction 

In this paper we propose a range-based localization technique
where there are a few sensors with known positions
(anchors) and there is a set of distance measurements between
some pairs of sensors. The network is modeled by a graph
called the grounded graph (details in Section II-A). Eren et
al. in [5] introduced some conditions (called global rigidity)
on the grounded graph of the network which ensure its unique
localizability. This enables us to study the localizability of
the network by using graph-theoretical techniques (details in
Subsection II-B).

Localization techniques are generally classified into two
classes [2]: centralized (e.g. MDS-MAP [15]) and distributed
([12], [10]). Distributed techniques have the benefit of local
information exchange and fairly simple calculations compared
with centralized techniques (as discussed in [1], the
localization problem is NP-hard), at the cost of requiring
more distance measurements and anchors. In most of the
available distributed localization schemes the grounded graph
of the network is assumed to be trilaterative, i.e. each node
knows its distance to at least three neighbors.

In this work we adopt a merging/splitting strategy in which
the whole network is split into small sub-networks, each
of which is still localizable. These small sub-networks can be
efficiently localized with existing techniques in their local
coordinate bases, and then an appropriate and efficient merging
technique lets us find the position of all the nodes in global
coordinates. Due to the small size of the sub-networks this
technique is totally distributed. A similar idea has been used
in [14] to improve the performance of MDS-MAP by computing
local maps and then merging them back together.

Trilateration-based techniques are only applicable to dense
networks where there are many more known distances than
actually required for a network to be uniquely localizable. In
[16] it is explained that trilateration techniques fail to localize
the boundary nodes, while in some sensor network applications
like intrusion detection and coverage, these nodes have the
most important and valuable information among the sensors.
A much sparser and broader class of networks which can still
be localized in a distributed manner (i.e. beyond trilateration
networks) is the class of bilateration networks introduced in
[8], [7]. In these networks the position of each node is derived
from the distances to at least two neighbors with known
positions, instead of the at least three neighbors required in
the definition of trilateration networks (details in Section II).
Although every trilateration network is a bilateration network,
some bilateration networks do not have unique realizations
(no algorithm can find a unique position for every node).
Therefore, among these networks we are interested in the
uniquely localizable ones, which we call bilaterative localizable.

We combine the idea of splitting/merging with the
above-mentioned bilateration-based techniques to propose a
technique which is not only distributed but also applicable to
a broader class of networks known as distributedly localizable.
The only assumption is that the network can be split into
localizable sub-networks, which may or may not share some
nodes, such that there are enough links between pairs of
sub-networks to ensure the localizability of the union. The
merging technique proposed here sequentially merges pairs of
the sub-networks into larger sub-networks until the whole
network results. To accomplish this merging task, we mainly
use the idea of bilateration. However, we further enrich it by
incorporating the novel idea of four-bar linkage mechanisms
from geometry to tackle the possible merging configuration
where even the bilateration techniques fail (this configuration
is described as unsolvable in [8]).

Section II introduces the background techniques, theories
and notations we will be using throughout the paper. Section
III gives a clear definition of the problem, explaining the
distinct cases that need to be addressed, whereas Section IV
provides solutions for them. Section V introduces some further
discussions about the algorithm and compares the class of
networks addressed by this algorithm with other classes that
have already been solved. We finally give concluding remarks
in Section VI.

II.  Background 

The main focus of the paper is on localization in a 2D
ambient space, but wherever possible we provide a general
definition valid for any $d$-dimensional ($d \in \{2, 3\}$) ambient
space.

A.  Network  Abstraction 

To model a network we first need to introduce the notion
of frameworks. The grounded graph of the network is a graph
$G = (V, E)$ where each vertex $v \in V$ corresponds to a node
and there is an edge $(v_i, v_j) \in E$ if the distance between the
nodes corresponding to $v_i$ and $v_j$ is known. We define
$n = |V|$ to simplify the notation. A configuration
$p = \{p_1, \ldots, p_n\}$ is a finite collection of $n$ labeled points
in $\mathbb{R}^d$, where $d \in \{2, 3\}$ is the dimension. A framework
$F = (G, p)$ is a graph $G = (V, E)$ together with a corresponding
configuration $p = \{p_1, \ldots, p_n\}$ in $\mathbb{R}^d$. In this
framework, $p$ is the mapping $p : V \to \mathbb{R}^d$ where each vertex
$v_i \in V$, $i = 1, \ldots, n$, is assumed to be located at a
corresponding point $p_i \in p$. Sometimes this framework is called
a realization of the graph $G$.

In the context of network localization, a network of $n$ sensors
can therefore be modeled by the framework $F = (G, p)$
where $G = (V, E)$ is the grounded graph of the network and
$p$ is a mapping $p : V \to \mathbb{R}^d$. We use the terms framework
and network interchangeably throughout the paper.

B.  Global  Rigidity 

We say that two frameworks $F_1 = (G, p)$ and $F_2 = (G, q)$
are equivalent (shown by $F_1 \equiv F_2$) if for all pairs $i, j$ where
$(i, j) \in E$, $\|p_i - p_j\| = \|q_i - q_j\|$. Two configurations $p$ and
$q$ are called congruent, denoted by $p \cong q$, if for all pairs
$i, j \in V$, $\|p_i - p_j\| = \|q_i - q_j\|$ [4]. We say that a framework
$(G, p)$ is globally rigid in $\mathbb{R}^d$ if $(G, p) \equiv (G, q)$ implies
$p \cong q$ for any generic configuration $q$ with $|q| = |p|$. A
configuration is generic if the coordinates of the points are
algebraically independent from each other. In other terms, a
graph $G$ is said to be globally rigid if any set of its equivalent
frameworks $(G, p)$ can be obtained by the translation, rotation
or reflection of the whole framework.
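
The difference between equivalent frameworks and congruent configurations can be checked numerically. The following is a minimal sketch, assuming 2D points stored as tuples and edges as index pairs; the function names and the tolerance are illustrative choices, not part of the paper.

```python
import itertools
import math


def equivalent(edges, p, q, tol=1e-9):
    """(G, p) and (G, q) are equivalent: every edge of G has the same length."""
    return all(
        abs(math.dist(p[i], p[j]) - math.dist(q[i], q[j])) <= tol
        for i, j in edges
    )


def congruent(p, q, tol=1e-9):
    """p and q are congruent: every pair of points has the same distance."""
    return all(
        abs(math.dist(p[i], p[j]) - math.dist(q[i], q[j])) <= tol
        for i, j in itertools.combinations(range(len(p)), 2)
    )


# Two triangles sharing the edge (0, 1); vertex 3 is flipped across that edge.
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3)]
p = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.8), (0.6, -0.7)]
q = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.8), (0.6, 0.7)]
print(equivalent(edges, p, q), congruent(p, q))
```

In this example all edge lengths agree but the distance between vertices 2 and 3 changes, so the frameworks are equivalent without the configurations being congruent; the underlying graph is therefore not globally rigid.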

The framework $G(p)$ is rigid in $\mathbb{R}^d$ if there exists an
$\epsilon > 0$ such that for any other configuration $q$ in $\mathbb{R}^d$ where
$\|p - q\| < \epsilon$, $G(p) \equiv G(q)$ implies $G(p) \cong G(q)$. Roughly
speaking, a framework $G(p)$ is called rigid if, by fixing the set
of distances corresponding to the edges in $G$, the formation
does not have any flex motions [2].

The  following  theorem,  originally  from  [6]  (Theorem  1), 
states  the  necessary  and  sufficient  conditions  for  a  network  to 
be  uniquely  localizable.  By  uniquely  localizable  we  mean  that 
it  is  possible  to  uniquely  identify  the  position  of  each  node. 


Theorem 1. Suppose that $F = (G, p)$ is a framework in
$\mathbb{R}^d$ modeling a given network of at least $d + 1$ nodes. The
framework $F$ (and therefore the network it models) is uniquely
localizable if and only if $G$ is globally rigid and there are at
least $d + 1$ anchors located in generic positions.

We say that the graph $G$ is (globally) rigid in $\mathbb{R}^d$ if its
generic realizations in $\mathbb{R}^d$ are (globally) rigid. It is important
to mention that either all generic realizations of a graph are
(globally) rigid or all are non-(globally) rigid. Therefore, the
(global) rigidity property is independent of the configuration
of a framework if we restrict the configuration space to
only generic ones. Global rigidity has a fully combinatorial
characterization in 2D which can be tested with an efficient
algorithm [6].

C.  Merging  localizable  sub-networks 

We  now  restrict  attention  to  the  case  d  =  2.  Merging  is 
the  task  of  introducing  a  set  of  links  between  two  localizable 
networks  in  order  to  form  a  single  localizable  network.  This 
has  been  studied  in  detail  in  [5],  [17].  In  terms  of  global 
rigidity,  this  problem  is  equivalent  to  merging  two  globally 
rigid  graphs  by  adding  a  set  of  edges  so  that  the  union  is  a 
single  globally  rigid  graph.  It  is  possible  that  the  two  networks 
share  some  nodes.  However,  as  shown  in  [17],  these  are 
special  cases  of  the  situation  where  there  is  no  common  node. 
Therefore,  from  now  on  we  assume  that  the  sub-networks  are 
node-disjoint  unless  explicitly  specified.  Later  in  Subsection 
IV-D  we  address  the  case  where  the  sub-networks  have  some 
common  nodes. 

In  general,  a  sub-network  may  have  any  number  of  nodes. 
However,  to  address  the  general  case,  we  also  assume  that  each 
sub-network  contains  at  least  3  vertices.  Later  in  Subsection 
IV-E,  we  address  the  scenario  when  one  of  the  sub-networks 
has  only  1  or  2  nodes. 

The  following  theorem  discusses  the  necessary  (as  explained 
in  [17])  and  sufficient  (as  proved  in  [5])  conditions  for  such  a 
merging  to  happen,  in  terms  of  global  rigidity. 

Theorem 2. Suppose that two disjoint globally rigid graphs
$G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ with $|V_i| \ge 3$ are
connected by a set of links, denoted by $L$. Then
$G = G_1 \cup G_2 \cup G(L)$ is globally rigid if and only if the
following two conditions hold:

1. $|V_i \cap V(L)| \ge 3$, $i = 1, 2$.

2. $|L| \ge 4$.

where $G(L)$ is the graph induced by the edge set $L$ and
$V(L)$ is the set of end vertices of the edges in $L$.
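
The two counting conditions of Theorem 2 are simple to test. The sketch below assumes vertices are hashable labels and links are pairs with one end in each sub-network; the function name is hypothetical, and the global rigidity and disjointness of the two graphs are taken as given rather than checked.

```python
def merge_conditions_hold(v1, v2, links):
    """Check the two counting conditions of Theorem 2.

    v1, v2: vertex sets of two disjoint graphs, each assumed globally rigid.
    links:  iterable of (u, w) pairs joining a vertex of v1 to a vertex of v2.
    """
    links = set(links)
    end_vertices = {x for link in links for x in link}  # V(L)
    return (
        len(v1 & end_vertices) >= 3
        and len(v2 & end_vertices) >= 3
        and len(links) >= 4
    )


v1, v2 = {1, 2, 3}, {4, 5, 6}
print(merge_conditions_hold(v1, v2, [(1, 4), (2, 4), (2, 5), (3, 6)]))
```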

This  theorem  only  provides  the  conditions  to  ensure  that  the 
merging  leads  to  a  globally  rigid  graph  but  does  not  provide 
a  constructive  solution  to  the  localization  problem.  We  will 
address  this  by  providing  a  distributed  algorithm  for  merging 
two  uniquely  localizable  sub-networks. 

D.  Bilateration  and  Trilateration 

Most of the distributed localization techniques proposed so
far assume that the underlying network has a trilateration
ordering. However, as mentioned earlier, there is a broader class
of networks with bilateration ordering (called bilaterative
networks), which can be uniquely localized using a distributed
algorithm. In addition, the class of trilaterative networks is
a proper subset of the class of bilaterative networks. In
this section we provide a formal definition for bilateration
(trilateration) ordering. A network is said to have bilateration
(trilateration) ordering if its grounded graph has such an
ordering.

Figure 1. A traditional four-bar mechanism problem setup. A and B are
nodes with fixed positions.

Definition 3. A graph $G = (V, E)$ with $|V| \ge 3$ ($|V| \ge 4$)
has a bilateration (trilateration) ordering if its vertices can be
ordered as $v_1, \ldots, v_n$ ($n = |V|$) so that the subgraph induced
by $v_1, v_2, v_3$ is a triangle and each $v_i$, $3 \le i \le n$ ($4 \le i \le n$),
is connected to at least two (three) vertices $v_j$, $j < i$ [7].
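
One simple way to search for such an ordering is to try each triangle as a seed and greedily append any vertex that already has enough placed neighbors. The sketch below is an illustration under that assumption (vertices numbered 0 to n-1, hypothetical function name), not the procedure used in [7]; the greedy step is safe because adding a vertex never reduces the number of placed neighbors of the remaining ones.

```python
from itertools import combinations


def lateration_ordering(n, edges, k=2):
    """Return a bilateration (k=2) or trilateration (k=3) ordering, or None."""
    adj = {v: set() for v in range(n)}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    for a, b, c in combinations(range(n), 3):
        if b not in adj[a] or c not in adj[a] or c not in adj[b]:
            continue  # the seed must induce a triangle
        order, placed = [a, b, c], {a, b, c}
        grew = True
        while grew and len(order) < n:
            grew = False
            for v in range(n):
                if v not in placed and len(adj[v] & placed) >= k:
                    order.append(v)
                    placed.add(v)
                    grew = True
        if len(order) == n:
            return order
    return None


# Triangle 0-1-2, then vertex 3 attached to {1, 2} and vertex 4 to {2, 3}.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(lateration_ordering(5, edges, k=2))
print(lateration_ordering(5, edges, k=3))
```

The example graph has a bilateration ordering but no trilateration ordering, which illustrates why the bilaterative class is strictly larger.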

The  bilateration  operation  is  the  operation  of  localizing 
the  position  of  a  node  up  to  a  set  of  finite  possibilities  by 
considering  its  distances  to  at  least  two  neighbors.  This  can 
be  done  by  solving  the  equations  obtained  from  these  distance 
constraints  (intersection  of  multiple  circles  each  one  around  a 
neighbor)  in  addition  to  the  position  (global  or  relative)  of 
those  neighbors. 
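
For two neighbors, the bilateration operation reduces to intersecting two circles. The following is a minimal sketch of that computation, assuming exact (noise-free) distances and 2D coordinates; the function name and tolerance are illustrative.

```python
import math


def bilaterate(c1, r1, c2, r2, tol=1e-9):
    """Candidate positions at distance r1 from c1 and r2 from c2 (0, 1 or 2 points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    # No intersection: coincident centers, circles too far apart, or one inside the other.
    if d < tol or d > r1 + r2 + tol or d < abs(r1 - r2) - tol:
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))    # half-length of the chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    if h < tol:
        return [(mx, my)]                       # tangent circles
    ox, oy = -h * (y2 - y1) / d, h * (x2 - x1) / d
    return [(mx + ox, my + oy), (mx - ox, my - oy)]


# A node at (1, 2) seen from neighbors at (0, 0) and (4, 0).
print(bilaterate((0.0, 0.0), math.sqrt(5.0), (4.0, 0.0), math.sqrt(13.0)))
```

The two candidates are mirror images across the line through the two neighbors, which is why an additional distance is needed to single out the true position.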

E.  Four-bar  linkage  Mechanism 

We  will  use  the  idea  of  four-bar  linkage  mechanisms  in  this 
paper  to  explore  one  of  the  merging  scenarios  which  had  been 
described  in  [8]  as  unsolved  by  bilateration-based  localization 
techniques.  The  four-bar  linkage  mechanism  has  been  used  in 
solving  a  localization  problem  in  [13].  To  cope  with  the  space 
limitations  we  introduce  the  mechanism  very  briefly.  Refer  to 
[3]  for  further  details. 

Figure  1  shows  a  typical  four-bar  linkage.  Assume  that  the 
vertices  A-E  are  joints  and  the  edges  between  them  are  bars 
with  fixed  length.  If  we  fix  the  position  of  the  vertices  A  and 
B,  and  allow  the  other  nodes  to  move  freely  in  the  plane,  the 
position  of  the  node  D  will  follow  a  curve  called  a  coupler 
curve  [3],  [13].  Assuming  that  joint  A  is  the  origin  and  joint 
B  is  on  the  positive  side  of  x-axis,  this  curve  can  be  explicitly 
expressed  by  the  degree-six  equation  (1)  where  (x,y)  is  the 
position  of  the  vertex  D  [11]: 

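$$
\begin{aligned}
& a^2 \left( (x-k)^2 + y^2 \right) \left( x^2 + y^2 + b^2 - r^2 \right)^2 \\
& \quad - 2ab \left( (x^2 + y^2 - kx) \cos\eta + ky \sin\eta \right) \left( x^2 + y^2 + b^2 - r^2 \right) \left( (x-k)^2 + y^2 + a^2 - R^2 \right) \\
& \quad + b^2 \left( x^2 + y^2 \right) \left( (x-k)^2 + y^2 + a^2 - R^2 \right)^2 \\
& \quad - 4a^2b^2 \left( (x^2 + y^2 - kx) \sin\eta - ky \cos\eta \right)^2 = 0 \qquad (1)
\end{aligned}
$$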
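
Equation (1) can be evaluated directly as a polynomial residual. The sketch below assumes the usual roles of the symbols, which are not spelled out in the text above: the fixed pivots are at (0, 0) and (k, 0), r and R are the lengths of the bars attached to those pivots, b and a are the distances from the coupler point to the moving ends of those bars, and η is the angle at the coupler point between them. The function name is hypothetical.

```python
import math


def coupler_curve(x, y, k, r, R, a, b, eta):
    """Left-hand side of Equation (1); zero when (x, y) lies on the coupler curve."""
    u = x * x + y * y + b * b - r * r
    w = (x - k) ** 2 + y * y + a * a - R * R
    t = x * x + y * y - k * x
    c, s = math.cos(eta), math.sin(eta)
    return (
        a * a * ((x - k) ** 2 + y * y) * u * u
        - 2 * a * b * (t * c + k * y * s) * u * w
        + b * b * (x * x + y * y) * w * w
        - 4 * a * a * b * b * (t * s - k * y * c) ** 2
    )


# Build a linkage around a chosen coupler point so the point is on the curve
# by construction, then evaluate the residual there.
theta, eta, a, b, k = 0.4, 1.1, 0.7, 0.5, 2.0
px, py = 1.2, 0.9
ax, ay = px - b * math.cos(theta), py - b * math.sin(theta)
bx, by = px - a * math.cos(theta + eta), py - a * math.sin(theta + eta)
r, R = math.hypot(ax, ay), math.hypot(bx - k, by)
print(coupler_curve(px, py, k, r, R, a, b, eta))
```

Up to floating-point rounding the printed residual should vanish; candidate positions for a node on a coupler curve are the points where this residual and a circle equation are simultaneously zero.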

III.  Problem  Statement 

Assume that we have a large network which has already
been split into small sub-networks with the condition that each
of them is uniquely localizable up to at least a local coordinate
basis, i.e. the grounded graph of each sub-network is globally
rigid. As mentioned before, we assume that each sub-network
has at least 3 nodes.

We are interested in a step-by-step merging process in which
each sub-network is merged into one of its neighboring
sub-networks to form a larger single post-merged localizable
sub-network, and this merging continues until the whole network
results. We will not discuss the conditions required for a
network to be splittable in the above manner. However, since
any trilaterative network can be treated as a special case of
merging (as explained in Subsection IV-E, if one of the
sub-networks contains only 1 node, the merging configuration
reduces to a trilateration configuration) and there are merging
configurations which are not trilaterative, this class is still
strictly larger than the class of trilateration networks (an
example is explained in [16]). It is also only required that
there are at least 3 anchors somewhere in the network (not
necessarily lying in the same sub-network). We also assume
that the distance measurements are accurate and there is no
noise in the estimated distances. One immediate
result of these assumptions is that the distances between all
pairs of nodes in any localized sub-network are known (we
assume that every node in the sub-network knows the location
information of the others).

Based on these assumptions, in each step of the algorithm,
we take the coordinate basis of a fixed sub-network as the
reference coordinate basis and then merge a neighboring
sub-network (already localized in its local basis) with this
sub-network. This results in a post-merged sub-network (localized
in the reference coordinate basis). In further steps, we merge
other sub-networks into the already localized (up to the
reference basis) sub-networks in a similar manner. The main
step of the process (merging two neighboring localizable
sub-networks) can then be formally defined as follows:

Assume there are two sub-networks $F_1 = (G_1, p_1)$ and
$F_2 = (G_2, p_2)$ where both have been localized by an arbitrary
localization algorithm. $F_1$ has been localized in the reference
basis and $F_2$ is localized in its local basis. Also assume that
there are enough links (and/or common nodes) between the $F_i$
satisfying the conditions of Theorem 2. Apply a merging
technique to localize the nodes in $F_2$ into the reference
coordinate basis.

If there are three or more vertices in $F_2$, once the three
or more nodes involved in the merging are localized, the
other nodes in $F_2$ can be localized by simply computing the
transformation matrix between the reference coordinate basis
and the local basis of $F_2$.
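
The transformation between the two bases can be recovered from three matched nodes. The sketch below is one possible way to do this, assuming noise-free 2D positions and treating points as complex numbers so that a rotation plus translation is $z \mapsto \alpha z + \beta$ with $|\alpha| = 1$, and a reflected one is $z \mapsto \alpha \bar{z} + \beta$. The function name and tolerance are illustrative, not taken from the paper.

```python
def fit_rigid_transform(local_pts, ref_pts, tol=1e-6):
    """Return a function mapping local coordinates to reference coordinates.

    local_pts, ref_pts: matched lists of at least three (x, y) positions of
    the same nodes in the two bases; the first two must be distinct.
    """
    z = [complex(*pt) for pt in local_pts]
    w = [complex(*pt) for pt in ref_pts]
    for reflect in (False, True):
        zz = [v.conjugate() for v in z] if reflect else z
        alpha = (w[1] - w[0]) / (zz[1] - zz[0])  # rotation fixed by two points
        beta = w[0] - alpha * zz[0]              # translation
        # The remaining points decide whether a reflection is needed.
        if all(abs(alpha * u + beta - v) <= tol for u, v in zip(zz, w)):
            def transform(pt, alpha=alpha, beta=beta, reflect=reflect):
                v = complex(*pt)
                if reflect:
                    v = v.conjugate()
                out = alpha * v + beta
                return (out.real, out.imag)
            return transform
    raise ValueError("point sets are not related by a rigid transformation")


# Three merged nodes in the local basis and in the reference basis.
to_ref = fit_rigid_transform(
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(2.0, 3.0), (2.0, 4.0), (3.0, 3.0)],
)
print(to_ref((2.0, 1.0)))  # position of a fourth node of F2 in the reference basis
```

Two matched points fix the map only up to a reflection, which is why at least three non-collinear localized nodes are needed before the rest of $F_2$ can be transferred.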

A.  Possible  configurations 

According to Theorem 2, there must be at least four distance
measurements between $F_1$ and $F_2$ in order for a merging
algorithm to succeed. It suffices to consider only four links
between the $F_i$ ($|L| = 4$), as in the absence of noise four
distance measurements are enough to carry out the merging.
If there are more than four links, we just pick four of them
and proceed with the merging using those links (in Subsection
V-A we discuss the use of those extra links to lower the
computational cost of the algorithm). As it is necessary for
each $F_i$ to involve at least three vertices, there are 4 possible
configurations if the $F_i$ do not share a common vertex:

1) $|V_i \cap V(L)| = 3$, $i = 1, 2$;

2) $|V_1 \cap V(L)| = 4$ and $|V_2 \cap V(L)| = 3$;

3) $|V_1 \cap V(L)| = 3$ and $|V_2 \cap V(L)| = 4$;

4) $|V_i \cap V(L)| = 4$, $i = 1, 2$.

Figure 2. (a) 3-by-3 and (b) 4-by-3 configurations. $n_k$ is the $k$th node in
the bilateration ordering.

Allowing the $F_i$ to share some nodes results in three different
scenarios (the solution to these cases will be discussed in
Subsection IV-D):

1) $|V_1 \cap V_2| = 1$ and $|L| = 2$ (1 node in common and 2
links joining the $F_i$);

2) $|V_1 \cap V_2| = 2$ and $|L| = 1$ (2 nodes in common and 1
link joining the $F_i$);

3) $|V_1 \cap V_2| \ge 3$ and $|L| = 0$ (3 or more nodes in
common).

IV.  Solution 

In this section, we explain how the bilateration technique
and four-bar linkage mechanism can be used to merge the $F_i$.
We address the merging for all different configurations case by
case, starting from cases where there is no node in common.

A. Three-by-Three

In the first possible configuration, three nodes in each
sub-network are joined to each other by the four links in $L$
(Figure 2a). As mentioned before, all the three nodes in $F_1$ are
already localized into the reference basis and act as anchors.
We apply the bilateration operation to localize the nodes in $F_2$
into the reference coordinates.

Figure 2a shows an example of a three-by-three merging. In
this case there is always a node in $F_2$ which is connected to
exactly two nodes in $F_1$. Applying the bilateration operation
(using these links) gives two possible positions for this node.
For example in Figure 2a, node $n_3$ is adjacent to $n_1, n_2 \in V_1$
and therefore using bilateration, we can derive two possible
positions for it. Following this figure, the node $n_4$ is connected
to $n_3$ and $n_2$ and hence by further applying bilateration, we
obtain up to 4 possible positions for it. Following this process,
eight possible positions are derived for node $n_5$ using the
connections to $n_3$ and $n_4$. However, the distance between $n_5$
and $n_6$ is also known, and generic global rigidity of the whole
network implies that only one of the eight possibilities is valid.
After finding the unique position of $n_5$, we can backtrack and
identify the unique positions of $n_4$ and $n_3$ as well, by
discarding inconsistent position candidates.
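
The candidate expansion and pruning described above can be sketched as nested bilaterations followed by a consistency check. The code assumes the link pattern read from the Figure 2a example ($n_1, n_2, n_6$ in $F_1$; $n_3, n_4, n_5$ in $F_2$; links $n_1 n_3$, $n_2 n_3$, $n_2 n_4$ and $n_5 n_6$) and exact distances; function and parameter names are hypothetical.

```python
import math


def circle_intersections(c1, r1, c2, r2, tol=1e-9):
    """Intersection points of two circles (empty list if they do not meet)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d < tol or d > r1 + r2 + tol or d < abs(r1 - r2) - tol:
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -h * (y2 - y1) / d, h * (x2 - x1) / d
    return [(mx + ox, my + oy), (mx - ox, my - oy)]


def merge_three_by_three(n1, n2, n6, d13, d23, d24, d34, d35, d45, d56, tol=1e-6):
    """Return all (n3, n4, n5) candidates consistent with the fourth link n5-n6.

    d34, d35, d45 are known from the local localization of F2; the other
    distances are the measured links between F1 and F2.
    """
    consistent = []
    for p3 in circle_intersections(n1, d13, n2, d23):          # up to 2
        for p4 in circle_intersections(p3, d34, n2, d24):      # up to 4
            for p5 in circle_intersections(p3, d35, p4, d45):  # up to 8
                if abs(math.dist(p5, n6) - d56) <= tol:
                    consistent.append((p3, p4, p5))
    return consistent
```

Keeping each candidate for $n_5$ together with the $n_3$ and $n_4$ that produced it makes the backtracking step implicit: the surviving tuple already carries the resolved positions of all three nodes. With noisy measurements the fixed tolerance would need to be replaced by a more careful selection rule.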

B.  Four-by-Three  &  Three-by-Four 

Similar to the three-by-three configuration, in the four-by-three
case there is a node in $F_2$ which is adjacent to two nodes
in $F_1$ and therefore can be localized up to a finite number of
possible positions using the bilateration operation. Other steps
are exactly the same as in the three-by-three case (Figure 2b).

Figure 3. 4-by-4 configuration. Bilateration fails to extend from left to right.

Figure 4. Two possible 4-bar linkages, (a) and (b), in a 4-by-4 configuration.
Each node $n_i$, $i = 7, 8$, lies on the intersection of a coupler curve and a circle.

In the three-by-four case, there is no bilateration ordering
which is initiated in $F_1$ and grows towards $F_2$. However, with
a simple device we can still obtain a bilateration ordering.
Instead of starting from $F_1$, we start from $F_2$ as if it were
the sub-network localized into the reference basis. An algorithm
similar to that of the four-by-three case then yields unique
locations of the nodes in $F_1$ in the coordinate basis of
$F_2$. Since the actual positions of the nodes in $F_1$ are known a
priori (up to the reference coordinates), we can obtain the
transformation matrix to convert the local positions of the
nodes in $F_2$ into the reference coordinates.

C.  Four-by-Four 

In this case four nodes in $F_1$ and four nodes in $F_2$ are
connected via the four links (Figure 3). Note that in the figure,
the edges joining the 4 vertices in $F_1$ and in $F_2$ may be
virtual edges, i.e. edges whose lengths become known after
localizing within the particular $F_i$. The bilateration technique
we have used so far fails here, as there is no node in $F_2$
with more than one neighbor in $F_1$ and a bilateration ordering
cannot be extended from $F_1$ to $F_2$. In [13], the authors
solve a similar problem using the semi-definite programming
(SDP) technique. However, this requires a high computational
capability which may be beyond the processing capabilities
of most of the sensors. Instead, we incorporate the four-bar
linkage technique introduced in II-E, which can be easily
carried out by sensors with low computational capabilities.

1) Four-bar linkage technique: This technique has been
previously used in [13] to address some localization problems
differing from the subject of this paper. Figure 4 shows how
the four-bar linkage mechanism can be used to solve the
merging problem. The four links joining the sub-networks of
Figure 3 can actually form two separate four-bar linkages as
shown in Figure 4. According to [3], the structure in Figure
4a (Figure 4b) leads to a coupler curve for node $n_7$ ($n_8$). In
both cases, the nodes $n_1$, $n_2$ are the fixed bases in the four-bar
linkage mechanism.

Assume that $n_1$ is at the origin and $n_2$ lies on the positive
side of the x-axis. The equation for the position of the node
$n_7$, shown by $K(x, y)$, can be obtained by substituting
$k = l_0$, $r = l_1$, $R = l_2$, $a = l_6$, $b = l_5$ in Equation 1. There is also
another coupler curve, called $K'(x, y)$, which can be obtained
by mirroring $K(x, y)$ against $l_0$ ($K'(x, y) = K(x, -y)$).

The position of the node $n_7$ must also satisfy the circle
equation (shown by $C_3(x, y)$) corresponding to link $l_3$ and
the position of $n_3$:

$$(x - x_3)^2 + (y - y_3)^2 = l_3^2$$

Therefore the position of $n_7$ must be on the intersection of
the two curves $K(x, y)$ and $C_3(x, y)$. The real intersections
of the curves can be obtained easily by using any numerical
technique (details are omitted owing to space limitations).
Although in general the number of solutions for a system
comprising a degree-6 equation and a quadratic equation can
be up to 12, it has been shown in [3] that the
intersection of the above equations can have at most six
solutions. Intersecting $K'(x, y)$ and $C_3(x, y)$ also gives at
most six other possible positions for $n_7$. Therefore, we may
have a total of at most 12 possible positions for $n_7$. We will
denote this solution set by $P = \{p_1, \ldots, p_s\}$, $1 \le s \le 12$.

By a similar procedure, we will obtain a maximum of 12
numerical values for the position of node $n_8$, denoted by
$Q = \{q_1, q_2, \ldots, q_t\}$, $1 \le t \le 12$. Since the distance $l_9$ between
$n_7$ and $n_8$ is known, for each $i \in \{1, \ldots, s\}$ and $j \in \{1, \ldots, t\}$ we
calculate $\|p_i - q_j\|$ and compare it with $l_9$. If the configuration
of the network is generic and the measurements are exact (no
noise), then out of these $s \times t$ possibilities for $(i, j)$, there
is exactly one pair of positions for $n_7$, $n_8$ which is consistent
with the distance $l_9$. Therefore, the distance $l_9$ will resolve the
unique positions of nodes $n_7$ and $n_8$. These unique positions
can help to resolve the unique positions of the nodes $n_i$, $i = 5, 6$
as well.
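The pairing test described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `best_pair`, the representation of candidate positions as `(x, y)` tuples, and the choice to return the pair with the smallest mismatch (so that the same routine still returns an answer when the distance is noisy) are assumptions introduced here.

```python
import itertools
import math

def best_pair(P, Q, link_length):
    """Return the candidate pair (p, q) whose separation is closest to link_length.

    P, Q:         lists of candidate (x, y) positions for the two nodes.
    link_length:  measured distance between the two nodes.
    """
    # Enumerate all s x t combinations and keep the one with the smallest mismatch.
    return min(
        itertools.product(P, Q),
        key=lambda pq: abs(math.dist(pq[0], pq[1]) - link_length),
    )
```

With exact measurements and a generic configuration the smallest mismatch is zero for exactly one pair; with noise, the gap between the best and second-best mismatch gives a rough indication of how reliable the decision is.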

The apparent downside of this technique is that there may
be up to 144 different possible $(i, j)$ pairs, which is a large
number, and the presence of any noise can lead to a wrong
decision about the correct positions. However, this is only the
worst-case theoretical bound and it is important to study the
average number of possibilities in real scenarios. We studied
this property in random networks, where the positions of
the nodes are drawn from a uniform distribution in 2D. To
do so, we split the unit square area into two same-sized
rectangles by a vertical line. Each side includes 4 randomly
located nodes representing $F_i$, $i = 1, 2$. Then, the coupler and
circle equations are derived and solved to obtain the possible
positions of $n_7$ and $n_8$. This whole process is repeated 20
times to get the average number of possible intersections.

As can be seen in Table I, the average intersection count is
4. This implies that the total number of possibilities for $(i, j)$ is
expected to be about 16, which is far less than the theoretical worst
case of 144 obtained earlier. It is also worth mentioning that
in our simulations we never encountered a case where 12
intersections occurred; the worst case produced was 8 for
one node.


Table I
Number of intersections between the circle and the coupler
curve in a random experiment of sample size 20

Property        mean   min   max   standard deviation
Intersections   4      2     8     1.9

Figure  5.  Common  vertices:  (a)  one  common  vertex;  (b)  two  or  more 
common  vertices 

D.  Sub-networks  with  Common  Nodes 

Figure 5a shows an example where one node ($n_1$) is shared
between the $F_i$. Since $n_1$ is in $F_1$, $n_3$ is connected to two
vertices from $F_1$ and therefore we can apply the bilateration
operation to find two possible positions for it. Again, four possible
positions are obtained for $n_4$ using the bilateration operation
from the nodes $n_4$ and $n_3$. Since the distance between $n_4$ and $n_5$
is known, it can resolve the unique position of $n_4$ and, by
backtracking, the unique position of $n_3$. If there is more than
one vertex in common, the bilateration reduces to a simple
trilateration. For example, in Figure 5b, $n_3$ is connected to
$n_1$, $n_2$, $n_4$, all from $F_1$, and therefore can be uniquely localized.

E.  Merging  a  sub-network  with  1  or  2  nodes 

In this section we provide a solution to the problem of
merging when the condition of Theorem 2 is not satisfied and
one of the $F_i$ has at most two nodes. Without loss of generality,
let us assume that $|F_1| \ge 3$ and $|F_2| \le 2$. As is shown in [9], a
necessary condition for a graph to be globally rigid is that the
graph is 3-connected. This implies that $|V_1 \cap V(L)| \ge 3$ (at
least three vertices in $F_1$ must be connected to the vertices
in $F_2$), or otherwise the post-merged graph will not be
3-connected.

$F_2$ has one vertex: As Figure 6a shows, it is enough that
the vertex in $F_2$ is connected to at least 3 vertices in $F_1$.
This is simply the well-known trilateration operation, and the
intersection of the three circles around the nodes $n_i$, $i = 1, \ldots, 3$
can uniquely identify the position of node $n_4$.

Figure 6. $F_2$ has one or two vertices: (a) $|F_2| = 1$; (b) $|F_2| = 2$

Figure 7. Classification of network topologies localizable by different
techniques. Four-bar mechanism enables us to localize networks in which
bilateration based algorithms fail.

$F_2$ has two vertices: Since the degree of each vertex
$n_i$, $i = 1, 2$ in $F_2$ is 1, each $n_i$ must also be connected to
at least two vertices in $F_1$. This simply means that we can
apply the bilateration operation starting from $F_1$ and localize
the position of each $n_i$. Figure 6b shows an example of such a
configuration to explain how the localization algorithm works
in this case. First, by using the distances from $n_3$ and $n_4$
to $n_1$, we find two possible positions for $n_1$. Similarly, two
positions are obtained for $n_2$ (by using the distances between $n_2$
and $n_5$, $n_6$). Then the actual distance between $n_1$ and $n_2$ can
be used to resolve their unique positions.
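The two-vertex merge just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names `circle_intersections` and `merge_two_nodes` are hypothetical, positions are `(x, y)` tuples, and the final selection picks the candidate pair whose separation is closest to the measured inter-node distance.

```python
import itertools
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles (the two bilateration candidates)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []  # coincident centres or no real intersection
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the common chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))    # half-length of the common chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -h * (y2 - y1) / d, h * (x2 - x1) / d
    return [(mx + ox, my + oy), (mx - ox, my - oy)]

def merge_two_nodes(anchors_1, dists_1, anchors_2, dists_2, d12):
    """Localize two nodes, each bilaterated from two already-localized anchors.

    d12 is the measured distance between the two unknown nodes; it is used
    to pick one candidate for each node. Raises ValueError if either pair
    of circles does not intersect.
    """
    cand_1 = circle_intersections(anchors_1[0], dists_1[0], anchors_1[1], dists_1[1])
    cand_2 = circle_intersections(anchors_2[0], dists_2[0], anchors_2[1], dists_2[1])
    return min(
        itertools.product(cand_1, cand_2),
        key=lambda pq: abs(math.dist(pq[0], pq[1]) - d12),
    )
```

If all four anchors happen to be collinear, the mirrored pair matches the distance equally well, which is one concrete instance of the non-generic configurations excluded in the text.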

V.  Discussions 

A.  Effect  of  additional  links  on  computational  complexity 

In reality, it is often the case that we have more known
distance constraints than the minimum required. Therefore, it
is relevant to find out the effect of additional links on the
computational complexity of the algorithm. If there
are more than four links (the number assumed so far) between the $F_i$,
the chance of having a bilateration ordering extending from
$F_1$ to $F_2$ increases. Hence, the more links we have,
the higher the possibility is that the problem can be solved
by bilateration. As bilateration techniques are computationally
simpler than the four-bar linkage algorithm, we conclude that
increasing the number of links between the two sub-networks
reduces the complexity of the algorithm with high probability.

B.  Classification  of  localizable  networks 

Figure 7 shows different categories of localizable networks.
Trilateration networks have been the target of most
distributed algorithms, as the calculations are fairly simple in
these networks. However, these networks are rather dense and
might be considered a small subclass of uniquely localizable
networks. Bilateration networks, on the other hand, form a
broader class of networks for which some efficient distributed
localization algorithms exist. The downside is that they may
need much more memory space than trilateration
techniques.

By  introducing  the  four-bar  linkage  technique,  we  have 
addressed  a  broader  class  of  networks  than  bilateration  which 
still  can  be  localized  by  efficient  algorithms  (Figure  7).  Notice 
that  there  are  still  algorithms  that  can  localize  all  networks 
with  globally  rigid  grounded  graphs  but  they  require  central 
calculations. 

We should mention that it is still not known whether all the
networks that can be localized in a distributed manner are
localizable by the technique proposed in this paper; this is
the subject of further studies.

VI.  Conclusion  and  future  work 

In this paper we proposed an efficient distributed localization
technique based on the idea of splitting large networks
into small ones, localizing those small sub-networks and merging
them to obtain the wholly localized network. The technique
proposed could address all possible merging configurations
while keeping the computational complexity low. The
algorithm adds the idea of the four-bar linkage mechanism to
bilateration-based techniques in tackling the problem. This
further broadens the class of networks that can be
localized in a distributed manner, and therefore outperforms most of
the existing distributed techniques, which require the network
to be trilaterative.

While we believe that some distributed splitting techniques
(e.g. distributed clustering techniques proposed for sensor
networks) can perform a splitting suitable for the method
proposed in this paper, splitting is still the subject of our future
work. We are also studying the effect of the node density
of a random network on the success rate of such splitting
techniques. In addition, the effect of noise on the accuracy of
the technique remains open to investigation, since if the
anchor density is low, a considerable amount of
noise may propagate throughout the network. Finally, it is also
presumably possible to extend this algorithm in some way to
3D ambient space, probably drawing on ideas of [17].

Abstract: Localisation is a vital problem in a multitude of research fields, such
as  navigation,  tracking,  sensor  networks  and  so  on.  In  previous  work,  the 
problem  is  considered  in  the  plane  or  in  three-dimensional  space.  This  work 
deals  with  the  problem  of  distance-based  localisation  on  the  surface  of  the  earth 
when  the  points  lie  in  a  two-dimensional  manifold.  The  challenge  lies  with 
finding  an  appropriate  technique  to  cope  with  noisy  measurements  when  the 
conventional  formulation  for  a  planar  model  cannot  be  used.  To  this  end,  we 
adopt  a  tool  recently  applied  to  the  planar  model,  the  Cayley-Menger  matrix. 
Simulation  results  show  that  the  proposed  method  is  effective  and  robust  to 
noise. We also quantify the effect of a planar approximation.

1 Introduction

Target  tracking  and  navigation  have  valuable  applications  in  defence,  and  for  any  type  of 
target  tracking  and  navigation,  target  localisation  is  critical  (Danchik,  1988;  Healey  et  al., 
1995;  Poisel,  2005;  Sabour  et  al.,  2008;  Tai  and  Bo,  2009).  The  goal  of  localisation  is  to 
find  the  location  of  a  target  based  on  a  number  of  measurements  from  sensors  at  known 
locations.  The  target  can  be  an  unmanned  aerial  vehicle  (UAV),  a  robot,  and  even  some 
sort  of  information  that  can  be  sensed.  Sensors  include  vehicles,  radar  stations,  or  any 
other objects at known locations and assisting the localisation of the target. The
measurements may be of different kinds, e.g., distance, time of arrival, time difference
of  arrival  (TDOA),  angle  of  arrival  and  so  on.  We  consider  here  the  use  of  distance 
measurements.  Our  interest  will  be  localisation  on  the  earth  without  GPS,  with  long 
distances  involved.  The  interesting  problems  are  those  where  there  is  noise  contaminating 
the  distance  measurements.  To  avoid  ambiguities,  three  or  more  measurements  are 
needed,  and  then  the  question  arises  of  how  to  allow  for  the  presence  of  noise  in  those 
measurements. 

For  conventional  localisation  problems  in  two-  and  three-dimensional  space,  a  recent 
paper  (Cao  et  al.,  2006)  has  shown  that  an  entity  formed  from  the  distances  between  the 
sensors  and  the  distances  between  the  sensors  and  the  target,  termed  the  Cayley-Menger 
determinant  (CMD),  can  be  used  to  formulate  certain  geometric  relations  among  these 
distances  in  the  noiseless  case.  This  fact  can  be  exploited  in  the  noisy  case,  so  that,  as 
illustrated  in  Cao  et  al.  (2006),  the  effect  of  errors  in  noisy  distance  measurements  can  be 
reduced,  thereby  obtaining  a  better  estimate  of  the  target  position  compared  to  other 
approaches  to  using  the  noisy  distance  measurements,  see  e.g.,  Niculescu  and  Nath 
(2003),  Savvides  et  al.  (2003),  Savarese  et  al.  (2002),  Terwilliger  et  al.  (2004)  and  Sayed 
and  Tarighat  (2005). 

However, if observations are made over sufficiently large distances, the surface of the
earth  cannot  be  assumed  to  be  flat  and  the  problem  accordingly  becomes  localisation  on 
the  sphere  (earth);  as  a  result,  the  CMD  approach  to  localisation  with  noisy 
distance  measurements  cannot  be  used  without  some  modification.  Localisation  on  the 
sphere,  which  has  been  dealt  with  in  practical  applications  (Lindsay,  2006;  LORAN-C 
General  Information,  1957;  Infrasonics  Program,  2003;  World  Wide  Lightning  Location 
Network,  2002),  often  assumes  that  distance  measurements  are  great  circle  distances 
rather  than  Euclidean  distances.  For  example,  in  the  electronic  navigation  system 
LORAN-C  used  by  ships  (LORAN-C  General  Information,  1957),  the  RF  signals 
transmitted  from  chains  of  shore  stations  at  known  locations  to  ships  are  surface 
waves propagating on a carrier frequency of 100 kHz out to distances of thousands
of  kilometres  from  shore,  and  closely  conform  to  the  earth’s  curvature;  hence,  distance 
measurements  are  assumed  to  be  great  circle  distances  in  a  relevant  study  (Schmidt, 
1972). Another example comes from the detection of anomalous HF signals by the
Jindalee over-the-horizon radar (OTHR) network. In conjunction with the OTHR, a
small and widely spaced network of cheap, broadband receivers and spectrum
analysers was deployed to localise the anomalous HF signals by cross-correlating these
signals  to  compute  TDOA  (Lindsay,  2006).  As  these  signals  had  travelled  in  a  series  of 
hops  within  the  narrow  (relative  to  the  earth’s  radius)  wave  guide  bounded  by  the 
ionosphere  and  the  earth’s  surface  and  the  distance  measurements  were  a  significant 
fraction  of  the  earth’s  circumference  they  could  be  approximated  as  great  circle  distances 
(Newsam,  2006). 

In  general,  localising  a  target  in  three-dimensional  space  requires  distance 
measurements  from  this  target  to  at  least  four  non-coplanar  sensors.  In  this  work,  we  shall 
attempt  to  examine  the  following  problem:  “given  three  sensors  and  one  target  on  the 
surface  of  a  sphere,  is  it  possible  to  localise  the  target  with  noisy  distance  measurements 
using  the  CMD  method;  if  so,  what  kind  of  performance  can  we  obtain  compared  to  other 
methods?”. 

For the purpose of analysis and reducing complexity in this paper, we make the
overall assumption that "the earth is a perfect sphere and all sensors and target lie on the
surface of the sphere". We comment briefly near the end of the paper on ways by which
one could allow for the ellipsoidal shape of the earth.

The  paper  is  organised  as  follows.  In  Section  2,  we  provide  some  background 
information.  We  state  the  definition  of  the  CMD  in  three-dimensional  space.  In  Section  3, 
we  look  at  how  it  is  possible  to  formulate  a  CMD  on  a  sphere  with  three  sensors  and  one 
target  in  the  noiseless  measurement  case  as  well  as  in  the  noisy  case.  In  Section  4,  we 
show  how  the  errors  in  noisy  measurements  can  be  estimated  and  subsequently  reduced 
by solving an optimisation problem; then we give the algorithm to locate the coordinates
of  an  unknown  point  on  the  surface  of  the  sphere,  using  noisy  or  noiseless  distance 
measurements.  In  Section  5,  we  investigate  the  localisation  problems  under  a  planar 
assumption,  to  identify  circumstances  where  a  planar  approximation  will  not  produce  a 
large  error.  The  paper  ends  with  concluding  remarks  and  directions  for  future  work  in 
Section  6. 


2  Background  concepts 

The basic problem of distance-based target localisation can be formally defined as follows:
given  a  set  of  sensors  at  known  positions,  and  a  set  of  distance  measurements  from  these 
sensors  to  the  single  unknown  target,  determine  the  position  of  the  target.  The  problem 
could  have  many  variations:  for  example,  when  multiple  targets  are  present,  when 
measurements  are  noisy,  when  the  sensor  positions  are  noisy  (see  e.g.,  Yu,  2007),  etc. 

2.1  Localisation  on  the  plane  using  distance  measurements 

The  localisation  problem  on  the  plane  is  simple.  In  the  noiseless  case,  with  two  sensors, 
one  can  determine  the  position  of  a  target  up  to  binary  ambiguity,  and  with  one  extra 
sensor,  uniquely  (provided  that  these  sensors  are  not  collinear).  When  the  measurements 
are  noiseless,  a  conventional  multilateration  (trilateration  in  this  case)  method  will  solve 
the  problem.  One  can  imagine  drawing  circles  centred  at  each  sensor  with  radii  equal  to 
the  associated  distance  measurement,  and  determining  a  common  point  of  intersection.  In 
the  absence  of  sensor  collinearity,  there  is  a  single  such  point,  being  the  target  position. 
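The noiseless planar case can be illustrated with a short sketch. It does not intersect circles graphically; instead it uses the standard linearisation obtained by subtracting the first circle equation from the other two, which removes the quadratic terms and leaves a 2 x 2 linear system. The function name `trilaterate` and the tuple representation of positions are assumptions made for this example.

```python
def trilaterate(sensors, dists):
    """Planar target position from three non-collinear sensors and exact distances.

    sensors: three (x, y) sensor positions.
    dists:   the three sensor-to-target distances.
    """
    (x1, y1), (x2, y2), (x3, y3) = sensors
    r1, r2, r3 = dists
    # Subtracting circle 1 from circles 2 and 3 gives two linear equations in (x, y).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("sensors are collinear; the position is not unique")
    # Cramer's rule for the 2 x 2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)
```

The determinant vanishes exactly when the three sensors are collinear, which matches the non-collinearity condition stated above. With noisy distances the same formula still returns a point, but it no longer lies on all three circles.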

In the noisy measurement situation, an additional step has to be performed to
compensate for the effect of noise. Various approaches have been proposed in Cao et al.
(2006), Niculescu and Nath (2003), Savvides et al. (2003), Savarese et al. (2002),
Terwilliger et al. (2004) and Sayed and Tarighat (2005). Cao et al. (2006) explored an
approach based on using an underlying geometric relationship, expressed using the
Cayley-Menger matrix, to formulate an optimisation problem to estimate the noise
contained in each of the three sensor to target distance measurements. The proposed
method  is  effective  for  small  noise  and  when  the  sensors  and/or  target  are  not  collinear  or 
close  to  being  collinear.  It  appeals  to  an  underlying  geometric  constraint  on  the  true 
distances.  It  significantly  out-performs  methods  based  on  linear  calculations,  see  e.g., 
Sayed  and  Tarighat  (2005). 

In  two  dimensions,  more  than  three  sensors  can  of  course  be  used.  The  method  of 
Cao  et  al.  (2006)  deals  with  this.  In  three  dimensions,  a  minimum  of  four  sensors  is 
required.  This  is  not  hard  to  see,  since  it  is  a  straightforward  generalisation  of  the 
two-dimensional case.

2.2 The CMD


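The Cayley-Menger matrix of $n$ points in an $m$-dimensional space is defined as per
(Blumenthal and Gillam, 1943)

$$
M = \begin{bmatrix}
0 & d_{0,1}^2 & \cdots & d_{0,n-2}^2 & d_{0,n-1}^2 & 1 \\
d_{1,0}^2 & 0 & \cdots & d_{1,n-2}^2 & d_{1,n-1}^2 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
d_{n-2,0}^2 & d_{n-2,1}^2 & \cdots & 0 & d_{n-2,n-1}^2 & 1 \\
d_{n-1,0}^2 & d_{n-1,1}^2 & \cdots & d_{n-1,n-2}^2 & 0 & 1 \\
1 & 1 & \cdots & 1 & 1 & 0
\end{bmatrix}
$$

where $d_{i,j} = d_{j,i}$, $i, j = 0, \ldots, n-1$, $i \ne j$, is the Euclidean distance between the points $p_i$ and
$p_j$. The following is the key theorem which we shall use: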

Theorem 2.1: (Blumenthal and Gillam, 1943) Consider an $n$-tuple of points $p_0, \ldots, p_{n-1}$ in
$m$-dimensional space. If $n \ge m + 2$ then the $(n + 1) \times (n + 1)$ Cayley-Menger matrix
$M(p_0, \ldots, p_{n-1})$ has rank at most $m + 2$.

Given  five  points  in  three-dimensional  space,  this  theorem  is  equivalent  to  requiring  a 
single  relationship  among  the  distances,  namely,  that  the  determinant  of  the  matrix  M  of 
(1)  is  zero: 

$$\det(M(p_0, p_1, p_2, p_3, p_4)) = 0 \qquad (1)$$


In  the  next  section,  we  shall  attempt  to  develop  a  variant  of  the  concept  to  use  on  the 
sphere.  The  two-dimensional  case  has  already  been  discussed  in  Cao  et  al.  (2006). 


3  Formulation  of  Cayley-Menger  constraint  on  a  sphere 

3.1 Localisation on a sphere with great circle distances

Given  two  sensors  on  a  sphere,  and  noiseless  great  circle  distances  to  an  emitter  or  target, 
the  target  evidently  can  only  be  localised  with  binary  ambiguity.  On  the  other  hand, 
measurements  from  three  or  more  sensors  will  in  general  resolve  the  ambiguity,  provided 
that  the  sensors  are  not  all  located  on  a  common  great  circle,  i.e.,  they  are  not  coplanar 
with  the  centre  of  the  sphere.  In  the  noisy  case,  similar  remarks  will  apply  as  for  the  case 
of  localisation  in  the  plane,  and  it  is  clear  that  one  needs  a  way  of  handling  the  noise.  Our 
approach  will  be  first  to  introduce  a  Cayley-Menger  matrix  and  determinant  appropriate 
for  the  sphere,  with  an  analogue  of  Theorem  2.1  applying  to  noiseless  measurements. 
Then  we  will  show  how  to  handle  the  presence  of  noise. 

Consider three sensor nodes 1, 2 and 3, with known positions $p_1$, $p_2$, $p_3$ and a further
node 0, the target, with unknown position $p_0$. All four points lie on the surface of
the sphere. Sensor to target distances on the surface of the sphere in general are given as
great circle distances, see Schmidt (1972) and Newsam (2006).

Since a CMD involves the Euclidean distances, it is therefore essential to convert
each great circle distance into its corresponding Euclidean distance, or the distance along
the chord formed by the pair of end points. Assuming points on the surface of the sphere
are represented by vectors with the sphere centre at the origin, the great circle distance
between two points $a$ and $b$ with position vectors $a$, $b$ in three-dimensional Euclidean
coordinates is obtainable via the following equations (M'Clelland and Preston, 1907):

$$\cos\theta = \frac{\langle a, b \rangle}{\|a\|\,\|b\|} \qquad (2)$$

$$\hat{d}_{a,b} = r\theta \qquad (3)$$

Here, $\langle a, b \rangle$ is the dot product of the position vectors of the two points, $r$ is the radius of
the sphere, $\theta$ is the angle subtended at the origin, and $\hat{d}_{a,b}$ is the (accurate) great circle
distance between points $a$ and $b$.

The Euclidean distance $d_{ab}$ can be found by any one of the following equations:

$$d_{ab} = 2r \sin\left(\frac{\theta}{2}\right) = 2r \sin\left(\frac{\hat{d}_{a,b}}{2r}\right) \qquad (4)$$
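A small sketch of the conversions in equations (2) to (4) follows. The function names are hypothetical, points are plain 3-tuples relative to the sphere centre, and the cosine is clamped to [-1, 1] as a numerical safeguard that is not part of the original text.

```python
import math

def central_angle(a, b):
    """Angle subtended at the sphere centre by position vectors a and b, equation (2)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    # Clamp to guard against rounding slightly outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))

def great_circle_distance(a, b, r):
    """Great circle distance between a and b on a sphere of radius r, equation (3)."""
    return r * central_angle(a, b)

def chord_from_great_circle(d_gc, r):
    """Euclidean (chord) distance corresponding to a great circle distance, equation (4)."""
    return 2.0 * r * math.sin(d_gc / (2.0 * r))
```

For example, two points a quarter of a great circle apart on the unit sphere are separated by a chord of length $\sqrt{2}$.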


To  handle  the  noisy  measurement  problem,  our  first  goal  is  to  derive  an  analogue  of  a 
Cayley-Menger  determinantal  condition,  which  applies  for  true  distances.  However,  as 
noted  previously,  we  would  apparently  need  five  points  in  three-dimensional  space  to  do 
this.  An  additional  point  on  the  surface  of  the  sphere  would  resolve  the  problem,  but  in 
practical  situations,  it  would  increase  the  cost  of  localisation.  Adding  a  ground  beacon 
involves  large  amounts  of  infrastructure  cost  as  well  as  maintenance  of  the  beacons, 
which  does  not  make  sense,  if  it  is  just  to  apply  the  CMD  method.  We  limit  our  solution 
to  one  with  only  three  sensors  and  propose  the  following  novel  result  which  is  a  corollary 
of  Theorem  2.1,  simple  in  retrospect,  but  perhaps  not  so  obvious  until  it  has  been  stated: 

Corollary 3.1: Let $p_0$, $p_1$, $p_2$ and $p_3$ be four points on the surface of a sphere of radius $r$,
and suppose that $d_{ij}$ denotes the (exact) Euclidean distance between points $p_i$ and $p_j$. Then
with the definition of the spherical Cayley-Menger matrix (SCM) as

$$
\mathrm{SCM} = \begin{bmatrix}
0 & d_{01}^2 & d_{02}^2 & d_{03}^2 & r^2 & 1 \\
d_{01}^2 & 0 & d_{12}^2 & d_{13}^2 & r^2 & 1 \\
d_{02}^2 & d_{12}^2 & 0 & d_{23}^2 & r^2 & 1 \\
d_{03}^2 & d_{13}^2 & d_{23}^2 & 0 & r^2 & 1 \\
r^2 & r^2 & r^2 & r^2 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}
$$

there holds

$$\det(\mathrm{SCM}) = 0 \qquad (5)$$

With the four points $p_0$, $p_1$, $p_2$, $p_3$, associate a fifth point $p_4$, which is the centre of the
sphere. The Euclidean distance from this fifth point to each of the first four points is $r$.
Therefore, we can form the Cayley-Menger matrix associated with these five points, and
it is the matrix SCM appearing in (5); because all five points lie in three-dimensional space, the
determinant is zero by Theorem 2.1.

Remark 3.1: The problem, identified prior to the statement of Corollary 3.1, of finding a fifth point
is bypassed by not requiring the fifth point to lie on the surface of the sphere. Choosing it
at the sphere centre gives us the relevant distances, including that from the fifth point to
the target, whose position though unknown is known to be on the surface of the sphere.
We comment in the final section on what might be done when the sphere is replaced by
an ellipsoid.
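The vanishing determinant can be checked exactly on a small example. The sketch below is illustrative only: the helper names are hypothetical, and the points are chosen with integer coordinates on a sphere of radius 3 so that all squared distances are integers and the determinant can be evaluated in exact rational arithmetic.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def spherical_cm_matrix(points, r_squared):
    """6 x 6 SCM for four points on a sphere centred at the origin."""
    n = len(points)
    rows = [[sq_dist(points[i], points[j]) for j in range(n)] + [r_squared, 1]
            for i in range(n)]
    rows.append([r_squared] * n + [0, 1])   # row for the sphere centre
    rows.append([1] * (n + 1) + [0])        # bordering row of ones
    return rows

# Four integer points with squared norm 9, i.e. on the sphere of radius 3.
points = [(1, 2, 2), (2, 1, 2), (2, 2, 1), (-1, 2, 2)]
print(det(spherical_cm_matrix(points, 9)))
```

Replacing the correct value of $r^2$ with a wrong one makes the determinant nonzero for these points, which is the sensitivity that the noisy-case analysis relies on.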

Remark  3.2:  One  should  make  the  distinction  between  this  definition  of  SCM,  as  a  special 
case  of  a  three-dimensional  CM  when  four  points  are  co-spherical  and  one  point  is  the 
centre  of  that  sphere,  with  another  special  CM  matrix  given  in  Michelucci  and  Foufou 
(2004)  for  the  case  when  all  the  five  points  are  co-spherical. 

Remark 3.3: It can be easily verified that the determinant of SCM becomes the
determinant of a CM matrix for a problem in which $p_0$, $p_1$, $p_2$ and $p_3$ are coplanar when $r$
goes to infinity, i.e., ideas of Cao et al. (2006) are recovered.

3.2  Noisy  case:  SCM 

Let $d_{ij}$ denote the accurate Euclidean distance between nodes $i$ and $j$ with
$i, j \in \{0, 1, 2, 3\}$, $i \ne j$. Suppose 0 corresponds to the target, and nodes 1, 2 and 3 to the
sensors. If the great circle distance measurements from sensor to target, denoted by $\tilde{d}_{0i}$,
are corrupted by noise, we can find the corresponding noisy squared Euclidean distance,
denoted by $\bar{d}_{0i}^2$, using equation (4). We can postulate the existence of an error variable
$e_i$ relating the true values to the noisy values according to

$$\bar{d}_{0i}^2 = d_{0i}^2 - e_i \qquad (6)$$

for $i \in \{1, 2, 3\}$. We shall now utilise (6) in conjunction with the geometric constraint
condition associated with the SCM.

Substituting $\bar{d}_{0i}^2 + e_i$ in place of $d_{0i}^2$ in SCM yields a form for the SCM in which
noisy measurement values explicitly appear, as do the three unknowns $e_1$, $e_2$, $e_3$. Call this
form of the matrix SCM*, to emphasise the dependence on the $e_i$.

By evaluating the determinant of the matrix SCM* and setting it to zero, we
arrive at an equation which provides a relationship among the errors $e_1$, $e_2$, $e_3$, and it is
a relationship which includes the measured data. Whatever the errors are, they must
satisfy this relationship. The proof of the following theorem is largely parallel to that of Theorem 3
of Cao et al. (2006); however, an important non-coplanarity property has to be argued
here.

Theorem 3.4: Let $p_0$, $p_1$, $p_2$ and $p_3$ be four points on the surface of a sphere of radius $r$,
suppose $p_1$, $p_2$, $p_3$ are not coplanar with the centre of the sphere, let $d_{0i}$ and $\bar{d}_{0i}$ denote
the exact and noisy Euclidean distances between points $p_0$ and $p_i$, and let $e_i$ denote the
associated error between the squares as in (6). Then the errors $e_i$, $i \in \{1, 2, 3\}$ satisfy a
single algebraic equality which is quadratic though not homogeneous in the $e_i$'s, i.e., for
some $A$, $b$ and $c$, there holds

$$e^T A e + e^T b + c = 0 \qquad (7)$$

where $e = [e_1, e_2, e_3]^T$ and where $A$, $b$, $c$ depend on known data and are given in the proof below.

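Denote the new matrix as $\overline{\mathrm{SCM}}$:

$$
\overline{\mathrm{SCM}} = \begin{bmatrix}
0 & \bar{d}_{01}^2 + e_1 & \bar{d}_{02}^2 + e_2 & \bar{d}_{03}^2 + e_3 & r^2 & 1 \\
\bar{d}_{01}^2 + e_1 & 0 & d_{12}^2 & d_{13}^2 & r^2 & 1 \\
\bar{d}_{02}^2 + e_2 & d_{12}^2 & 0 & d_{23}^2 & r^2 & 1 \\
\bar{d}_{03}^2 + e_3 & d_{13}^2 & d_{23}^2 & 0 & r^2 & 1 \\
r^2 & r^2 & r^2 & r^2 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}
$$

Then the determinantal equation yields

$$\det(\mathrm{SCM}^*) = \det(\overline{\mathrm{SCM}}) = 0 \qquad (8)$$

Partition $\overline{\mathrm{SCM}}$ as follows

$$
\overline{\mathrm{SCM}} = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & Z_{22} \end{bmatrix}
$$

where $z_{11}$ is zero, $z_{12}$ is a row vector and $z_{21}$ is a column vector. If $Z_{22}$ is non-singular,
then

$$\det(\overline{\mathrm{SCM}}) = \det(Z_{22})\,[z_{11} - z_{12} Z_{22}^{-1} z_{21}] = 0 \qquad (9)$$

Observe that $Z_{22}$ is actually the standard Cayley-Menger matrix associated with the three
sensors $p_1$, $p_2$, $p_3$ (lying on the surface of the sphere) and the centre of the sphere. Since
these four points are not coplanar by hypothesis, $\det(Z_{22})$ is non-zero by a converse of
Theorem 2.1, see Michelucci and Foufou (2004). This ensures that $Z_{22}^{-1}$ exists, while $z_{11}$ is
zero. Hence, from (9)

$$z_{12} Z_{22}^{-1} z_{21} = 0$$

Given that

$$z_{12} = z_{21}^T = [\bar{d}_{01}^2 \;\; \bar{d}_{02}^2 \;\; \bar{d}_{03}^2 \;\; r^2 \;\; 1] + [e_1 \;\; e_2 \;\; e_3 \;\; 0 \;\; 0]$$

we obtain

$$e^T A e + e^T b + c = 0$$

where $e = [e_1, e_2, e_3]^T$, with $A$ being the top left $3 \times 3$ block of $Z_{22}^{-1}$,

$$b = 2\,[I_3 \;\; 0_{3 \times 2}]\, Z_{22}^{-1}\, [\bar{d}_{01}^2 \;\; \bar{d}_{02}^2 \;\; \bar{d}_{03}^2 \;\; r^2 \;\; 1]^T$$

and

$$c = [\bar{d}_{01}^2 \;\; \bar{d}_{02}^2 \;\; \bar{d}_{03}^2 \;\; r^2 \;\; 1]\, Z_{22}^{-1}\, [\bar{d}_{01}^2 \;\; \bar{d}_{02}^2 \;\; \bar{d}_{03}^2 \;\; r^2 \;\; 1]^T$$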
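The quadratic constraint on the errors can be assembled numerically as sketched below. This is a hedged illustration, not the authors' code. It assumes the sign convention "noisy squared distance = true squared distance minus error", builds the 5 x 5 Cayley-Menger matrix of the three sensors and the sphere centre, and obtains the coefficients by expanding the product of the perturbed row vector, the inverse of that matrix, and the perturbed column vector. In particular, `b` is taken as twice the first three entries of the inverse applied to the vector of noisy squared distances, $r^2$ and 1. All function names are hypothetical, and exact rational arithmetic is used so the identity can be checked without rounding.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def error_constraint(sensors, r_squared, noisy_sq):
    """Return (A, b, c) such that e^T A e + e^T b + c = 0 for the true errors e."""
    # Cayley-Menger matrix of the three sensors and the sphere centre.
    z22 = [[sq_dist(pi, pj) for pj in sensors] + [r_squared, 1] for pi in sensors]
    z22.append([r_squared] * 3 + [0, 1])
    z22.append([1, 1, 1, 1, 0])
    v = list(noisy_sq) + [r_squared, 1]
    w = solve(z22, v)                                   # inverse applied to v
    cols = [solve(z22, [int(i == k) for i in range(5)]) for k in range(3)]
    A = [[cols[k][i] for k in range(3)] for i in range(3)]  # top-left 3 x 3 of the inverse
    b = [2 * wi for wi in w[:3]]
    c = sum(vi * wi for vi, wi in zip(v, w))
    return A, b, c

def constraint_value(A, b, c, e):
    """Evaluate e^T A e + e^T b + c."""
    quad = sum(e[i] * A[i][k] * e[k] for i in range(3) for k in range(3))
    return quad + sum(ei * bi for ei, bi in zip(e, b)) + c
```

As a usage example, take sensors (2, 1, 2), (2, 2, 1), (-1, 2, 2) and a target (1, 2, 2) on the sphere of radius 3. The true squared distances are 2, 2 and 4; perturbing them by errors (1, -2, 3) gives noisy values 1, 4 and 1, and `constraint_value` evaluated at those errors returns exactly zero.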

Remark 3.5: In the event that four or more sensors, or multiple measurements from the
same set of three sensors, are available with noisy measurements to the target, one such
constraint equation can be found for each selection of three. With $N$ sensors, only $N - 2$
of these constraint equations are independent. One could consider constraint equations
using $\{1, 2, 3\}$, $\{1, 2, 4\}$, ..., $\{1, 2, N\}$ for example.

Remark 3.6: The condition that $p_1$, $p_2$ and $p_3$ not be coplanar with the centre of the sphere is
indeed essential for unique localisation, irrespective of the algorithms used. If the four
points were coplanar, there would be two positions for $p_0$, one on each side of the plane,
consistent with the distance constraints.


4  Localisation  on  the  sphere 

4.1  Determining  target  location 


This  subsection  explains  how  to  estimate  the  position  of  a  target  from  the  sensors’ 
positions  and  the  great  circle  distances  from  each  sensor  to  the  target.  Note  we  have 
assumed  that  the  sensor  positions  are  accurate  and  the  great  circle  distance  measurements 
may  be  noisy. 

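Consider temporarily the case when the great circle distances are noiseless, i.e., the
accurate great circle distances $\hat{d}_{0i}$ are used. We can write down the following four equations:

$$\frac{\hat{d}_{01}}{r} = \arccos\left(\frac{p_0^T p_1}{\|p_0\|\,\|p_1\|}\right) \qquad (10)$$

$$\frac{\hat{d}_{02}}{r} = \arccos\left(\frac{p_0^T p_2}{\|p_0\|\,\|p_2\|}\right) \qquad (11)$$

$$\frac{\hat{d}_{03}}{r} = \arccos\left(\frac{p_0^T p_3}{\|p_0\|\,\|p_3\|}\right) \qquad (12)$$

$$p_0^T p_0 = r^2 \qquad (13)$$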

The set (10) to (13) provides four equations for three unknowns. In the noiseless case,
there will exist a unique solution to the equations. Now, suppose that the great circle
distances are noisy. If we simply insert the noisy great circle distances $\tilde{d}_{0i}$ into the equations above,
there will in general no longer be any solution to the equations, because they are an
overdetermined set.

Let  us  now  indicate  an  algorithm  for  obtaining  a  solution  to  the  equations  in  the 
noiseless  case,  which  has  the  property  that  if  noisy  measurements  replace  noiseless  ones 
in  the  algorithm,  the  algorithm  can  still  be  executed  and  it  will  yield  a  target  position 
estimate  (though  not  of  course  one  which  satisfies  (10)  through  (13)  simultaneously, 
which  will  be  impossible).  In  the  next  subsection,  we  will  indicate  an  improvement  to  the 
algorithm  for  the  noisy  case. 

The  algorithm  is  motivated  by  what  has  been  suggested  for  planar  localisation  with 
three  noisy  distance  measurements  (Sayed  and  Tarighat,  2005): 

1 taking the cosine of both sides of equations (10) and (12) and subtracting the transformed (12)
from the transformed (10), we can obtain (14)

2  similarly,  we  can  obtain  (15)  from  (11)  and  (12) 

3  these  two  equations  are  then  combined  with  (13)  and  the  resulting  three  equations  are 
solved  for  the  three  unknowns. 


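$$\cos\left(\frac{d_{01}}{r}\right) - \cos\left(\frac{d_{03}}{r}\right) = \frac{p_0 \cdot p_1}{|p_0||p_1|} - \frac{p_0 \cdot p_3}{|p_0||p_3|} \tag{14}$$

$$\cos\left(\frac{d_{02}}{r}\right) - \cos\left(\frac{d_{03}}{r}\right) = \frac{p_0 \cdot p_2}{|p_0||p_2|} - \frac{p_0 \cdot p_3}{|p_0||p_3|} \tag{15}$$

$$p_0^T p_0 = r^2 \tag{16}$$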


In  the  noiseless  case,  the  above  method  must  deliver  a  correct  answer  for  p0  due  to 
geometric  consistency.  This  motivates  us  to  utilise  the  same  consistency  requirement 
embedded  in  the  Cayley-Menger  determinantal  condition  to  handle  the  noisy 
measurements:  “one  replaces  the  direct  measured  noisy  great  circle  distances  by  a  set  of 
estimated  great  circle  distances  that  have  geometric  consistency  (which  is  enforced  by  the 
use  of  a  CMD  constraint)  to  obtain  a  target  estimate”.  We  can  use  the  optimisation 
method  outlined  in  the  next  subsection  to  obtain  estimated  great  circle  distances. 
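
The three-step procedure of this subsection can be sketched in Python as follows. This is only an illustrative sketch, not the authors' MATLAB code. It assumes that |p0| = r, so that (14) and (15) become linear in p0, and it picks between the two roots allowed by (16) by comparing them against the supplied distances; that selection rule, and the function names `great_circle` and `locate_on_sphere`, are assumptions introduced here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(a / n for a in u)

def great_circle(p, q, r=1.0):
    """Great circle distance: r times the angle between p and q."""
    c = dot(unit(p), unit(q))
    return r * math.acos(max(-1.0, min(1.0, c)))

def locate_on_sphere(sensors, dists, r=1.0):
    """Estimate p0 from three sensors and three great circle distances."""
    u1, u2, u3 = (unit(s) for s in sensors)
    c1, c2, c3 = (math.cos(d / r) for d in dists)
    # With |p0| = r, (14) and (15) read a_k . p0 = b_k.
    a1 = tuple(x - y for x, y in zip(u1, u3))
    a2 = tuple(x - y for x, y in zip(u2, u3))
    b1, b2 = r * (c1 - c3), r * (c2 - c3)
    # Particular solution q in span{a1, a2}, from the 2x2 Gram system.
    g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
    det = g11 * g22 - g12 * g12
    alpha = (b1 * g22 - b2 * g12) / det
    beta = (b2 * g11 - b1 * g12) / det
    q = tuple(alpha * x + beta * y for x, y in zip(a1, a2))
    # Every solution of (14)-(15) is q + t*n; (16) fixes t up to sign.
    # If noise makes the discriminant negative, t is clamped to zero.
    n = cross(a1, a2)
    t = math.sqrt(max((r * r - dot(q, q)) / dot(n, n), 0.0))
    candidates = [tuple(x + s * t * y for x, y in zip(q, n))
                  for s in (1.0, -1.0)]

    def misfit(p):
        return sum((great_circle(p, s, r) - d) ** 2
                   for s, d in zip(sensors, dists))

    return min(candidates, key=misfit)

# Usage with the geometry of Subsection 4.3 (unit sphere, noiseless case).
sensors = [(0.0802, 0.3574, 0.9305),
           (0.2661, 0.8731, 0.4085),
           (0.1698, 0.9844, 0.0458)]
target = unit((0.4118, 0.7180, 0.5612))
dists = [great_circle(target, s) for s in sensors]
print(locate_on_sphere(sensors, dists))
```

With noiseless distances the selected root coincides with the true target up to rounding; with noisy distances the same code still returns an estimate, which in general no longer satisfies (10) to (13) simultaneously.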


4.2  Optimisation  and  error  reduction 

In  this  subsection,  we  will  see  how  the  errors  in  the  noisy  measurements  can  be 
estimated,  subsequently  leading  to  estimates  of  the  Euclidean  distances  between  the 
sensors  and  the  target,  which  are  consistent  with  the  geometrical  constraint  embodied  in 
the  CMD  being  zero.  The  estimates  then  allow  estimation  of  the  target  position. 

The analysis is analogous to that in Cao et al. (2006). Let ei, as defined in (6), be the
error in the estimated squared distance between the target and sensor i. We aim to
minimise:

$$J = e_1^2 + e_2^2 + e_3^2 \tag{17}$$




subject to the quadratic equality constraint (7). This is actually a reasonably standard problem of
numerical  analysis.  If  there  happen  to  be  more  constraints,  on  account  of  having  more 
sensors,  the  problem  is  less  standard,  but  nevertheless  well  posed. 

For  the  single  constraint  case  and  using  the  Lagrangian  multiplier  method,  we  obtain 
the  following  objective  function  H: 

$$H(e_1, e_2, e_3, \lambda) = e_1^2 + e_2^2 + e_3^2 + \lambda f(e_1, e_2, e_3) \tag{18}$$

where f(e1, e2, e3) has the quadratic form (7).

By differentiating the objective function H with respect to e1, e2, e3 and λ, and setting
the result to zero we can obtain four equations. One of these is (7).

On  solving  these  four  equations  numerically,  we  will,  often,  end  up  with  multiple 
solutions,  with  some  sometimes  being  complex  numbers.  Hence,  we  need  to  eliminate 
any  complex  solution  or  non-optimal  real  stationary  point  solution,  i.e.,  solutions 
corresponding  to  other  than  the  global  minimum.  The  global  minimum  must  be  one  of  the 
solutions. The solution for the least squares problem is then the set of ei's which satisfy
the condition set out below:

$$\min\; e_1^2 + e_2^2 + e_3^2 \quad \text{s.t.} \quad f(e_1, e_2, e_3) = 0$$

As we now have the values for e1, e2, e3, we then obtain the estimated Euclidean distances
and subsequently convert them into the estimated great circle distance for each of the
sensor to target pairs. The Euclidean distances are consistent with the Cayley-Menger
condition, in that, if substituted into the determinant, they result in the determinant being
zero. This also enforces the geometric consistency of estimated great circle distances, i.e.,
the four equations (10) to (13) will now have a solution.
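
A minimal numerical sketch of this constrained least squares step follows, in Python. Several points are assumptions made for illustration and are not taken from the text: the corrected squared Euclidean distance is taken to be the measured value minus ei (the sign convention of (6) is not reproduced in this section); the constraint f is evaluated as the 6 x 6 Cayley-Menger determinant of the points (p0, p1, p2, p3, centre of the sphere); and, instead of solving the four Lagrangian equations and screening all roots, the sketch iterates a linearised minimum-norm correction, which settles on the stationary point nearest to e = 0 when the noise is small. All function names are hypothetical.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, value = len(a), 1.0
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[piv][k] == 0.0:
            return 0.0
        if piv != k:
            a[k], a[piv] = a[piv], a[k]
            value = -value
        value *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return value

def chord_sq(d, r=1.0):
    """Squared Euclidean (chord) distance for great circle distance d."""
    return 2.0 * r * r * (1.0 - math.cos(d / r))

def cayley_menger(target_sq, sensors, r=1.0):
    """CM determinant; index 0 is the target, 1..3 sensors, 4 the centre."""
    sq = [[0.0] * 5 for _ in range(5)]
    for i, s in enumerate(sensors, start=1):
        sq[0][i] = sq[i][0] = target_sq[i - 1]
        sq[i][4] = sq[4][i] = sum(x * x for x in s)
        for j in range(i + 1, 4):
            t = sensors[j - 1]
            sq[i][j] = sq[j][i] = sum((x - y) ** 2 for x, y in zip(s, t))
    sq[0][4] = sq[4][0] = r * r
    bordered = [[0.0] + [1.0] * 5] + [[1.0] + row for row in sq]
    return det(bordered)

def correct_squared_distances(measured_sq, sensors, r=1.0, iters=100, h=1e-6):
    """Small-norm e such that (measured_sq - e) makes the determinant zero."""
    def f(e):
        return cayley_menger([m - x for m, x in zip(measured_sq, e)],
                             sensors, r)
    e = [0.0, 0.0, 0.0]
    for _ in range(iters):
        grad = []
        for i in range(3):
            up, down = e[:], e[:]
            up[i] += h
            down[i] -= h
            grad.append((f(up) - f(down)) / (2.0 * h))
        gg = sum(g * g for g in grad)
        if gg == 0.0:
            break
        # Minimum-norm point on the linearised constraint f + grad.(x - e) = 0.
        scale = (sum(g * x for g, x in zip(grad, e)) - f(e)) / gg
        e = [scale * g for g in grad]
    return e
```

At a fixed point of the iteration, e is parallel to the gradient of f and f(e) = 0, which are exactly the four stationarity equations of (18). The sketch does not check that this stationary point is the global minimum.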

In  effect,  in  this  subsection  we  have  almost  described  how  to  compute  a  maximum 
likelihood  estimate  of  the  target.  We  have  done  this  by  computing  a  maximum  likelihood 
estimate  of  the  errors  associated  with  the  squares  of  the  Euclidean  distances  between  the 
sensors  and  the  target  which  is  consistent  with  the  inherent  geometrical  constraint  that 
links  these  errors.  The  entire  derivation  is  very  reminiscent  of  the  two-dimensional 
result  of  Cao  et  al.  (2006).  The  argument  that  Z22  is  non-singular  is  peculiar  to  this 
problem. 

4.3  Computational  examples 

In  this  subsection,  we  will  give  two  related  computational  examples  to  demonstrate  the 
steps introduced in the previous sections. In the first example, we directly use the
noiseless great circle distances from each sensor to the target. In the second (noisy)
example, we introduce errors into the great circle distance measurements.

We first describe the setup of both examples, which deal with the same geometric
arrangement of sensors and target:

•  radius  of  sphere  in  the  two  cases  is  set  to  unity 

• actual target position p0 = [0.4118, 0.7180, 0.5612], denoted by the diamond in
Figure 1

• nodes 1, 2, 3 are designated as the sensors, with known position vectors, denoted by
the three round dots in Figure 1:

1 sensor position p1 = [0.0802, 0.3574, 0.9305]

2 sensor position p2 = [0.2661, 0.8731, 0.4085]

3 sensor position p3 = [0.1698, 0.9844, 0.0458]

4 the origin serves as a pseudo sensor p4

5 for the ten distances that are required in the three-dimensional Cayley-Menger
matrix, only the three great circle distance measurements between target and
sensors, namely [d01, d02, d03], will change, as a result of varying the target
position p0.

Figure  1  Noisy  case:  round  dots  denote  sensors,  diamond  denotes  actual  target,  triangle  denotes 
optimised  target  estimate  based  on  using  SCM-optimised  distance  estimates,  square 
denotes  a  target  estimate  based  on  using  noisy  distances  directly 

[Figure 1 plot omitted; plot title: Localization on Sphere using Cayley Menger approach]


In  the  first  example,  we  shall  show  the  performance  of  the  CMD  method  in  a  noiseless 
situation  by  using  the  matrix  SCM  of  Section  3.  The  noiseless  great  circle  distances 
measurements  between  the  actual  target  and  sensor  positions  are: 

d01 = 0.6236, d02 = 0.2627, d03 = 0.6395

By converting the great circle distances d0i into their corresponding Euclidean distances using
(4), we can then substitute the Euclidean distances into the matrix SCM.
Evaluating the determinant of SCM results in a value of 0 (as expected).
This indicates that the Euclidean distances are consistent with the set of points
(p0, p1, p2, p3, p4) in three-dimensional space and hence there is no correction needed. As
independent verification of this, we note that solutions for the errors obtained from
MATLAB simulation using the algorithm in Subsection 4.2 are as follows:

e1* = 0.000, e2* = 0.000, e3* = 0.000

Using these accurate great circle distances and following the method in Subsection 4.1,
we obtain the estimated target position p0* = [0.4118, 0.7180, 0.5612], which is the same
as the true target position as expected.

Let  us  now  consider  the  case  where  the  measurements  are  noisy.  All  nodes  and 
parameters  in  this  case  remain  the  same  as  those  in  the  first  case  apart  from  the  three 
great  circle  distance  measurements  from  sensors  to  the  target.  Suppose  the  three  great 
circle distance measurements are corrupted by noise within [-4%, 4%] of the actual
distances as follows:

d01 = 0.5986
d02 = 0.2575
d03 = 0.6587

By converting each noisy great circle distance d0i into its corresponding Euclidean distance,
we can then substitute the resulting values into the matrix SCM*. By evaluating the
determinant of the matrix SCM*, we obtain one quadratic equality constraint, defined by
(7). Subsequently, we followed the procedure described in Subsection 4.2 and
obtained the following solution to the constrained least squares problem:
e1 = -1.235 × 10^-3, e2 = 7.886 × 10^-3, e3 = -4.216 × 10^-3.

Following  the  method  in  Subsection  4.1,  we  can  solve  for  the  optimised  target 
estimate  and  we  obtain  the  position  [0.3973,  0.7099,  0.5816]. 

For a direct comparison with a non-optimised estimate, the noisy great circle distance
measurements d01, d02 and d03 are used directly in (10) to (13) and we obtain a target
estimate at position [0.4804, 0.6713, 0.5643] whose error is clearly substantial.

As  depicted  in  Figure  1,  for  this  example  which  is  reasonably  generic,  the  CMD 
method  of  estimating  errors  results  in  a  better  estimation  of  the  unknown  target  location 
on  the  surface  of  the  sphere  as  compared  to  just  using  the  noisy  measurements  for 
localisation  without  utilising  the  geometric  constraints. 


5  Localisation  under  a  planar  assumption 

As we have mentioned in Section 2, if distances between the nodes, including sensors and
a target, are small, the surface of the earth involving these nodes is almost flat so that the
two-dimensional CMD method will work approximately under a planar assumption.
However, besides errors from noisy measurements, new errors are induced by the
planar approximation. As the scale of the distances rises, the errors from the planar
approximation increase and eventually become unacceptable. At that point, the planar
assumption does not hold and we can apply the spherical CMD method instead of the
two-dimensional CMD method. In this section, we shall investigate the circumstances
where the two-dimensional CMD method can be employed under a planar assumption.

5.1 Problem model

Figure  2  illustrates  the  target  localisation  problem  solved  by  both  the  two-dimensional 
CMD  method  and  the  spherical  CMD  method: 

• point p0 represents the target to be localised and points p1, p2, p3 represent the three
sensors with known positions, all of which lie on the surface of a sphere corresponding
to the earth

• points $p_0^{*}$ and $p_0^{p}$ denote estimated positions of the target p0 obtained by using the
spherical CMD method and the two-dimensional CMD method under a planar assumption
respectively

• point $p_0^{*}$ lies on the surface of the sphere and point $p_0^{p}$ lies in the plane, named A,
which is determined by points p1, p2 and p3

• points lying both in A and inside the sphere form a disk, named C. Note that $p_0^{p}$ may
be inside, on, or outside the sphere, and thus inside or outside C

• point $p_0^{p'}$ is the intersection between the surface of the sphere and the ray starting
from the centre of the sphere, denoted O, and going through point $p_0^{p}$, and it is the
projection of point $p_0^{p}$ onto the surface of the sphere.

A Cartesian coordinate system is established in Figure 2 with the centre O of the sphere
as the origin, Z-axis perpendicular to A and arbitrary orthogonal X- and Y-axes in a plane
parallel to A. Designate p as the position vector of point p in the coordinate system. We
further define the following:

• d = max{d01, d02, d03, d12, d13, d23}

• $d_{\Delta\min}$ = min{d12, d13, d23} and $d_{\Delta\max}$ = max{d12, d13, d23}

• let $\hat{\sigma}$ be the great circle distance between points $p_0^{*}$ and $p_0^{p'}$ and σ be the Euclidean
distance between them, i.e., σ = $|p_0^{*} - p_0^{p'}|$

• l = $|p_0^{p}|$

• rp is the radius of the disk C

• h is the distance from the origin O to A.

Figure 2 A localisation scenario


Notes: The solid circle denotes a great circle of the sphere. The dashed ellipse denotes the
boundary of the disk C. $\hat{\sigma}$ is the great circle distance between points $p_0^{*}$ and $p_0^{p'}$
and σ is the Euclidean distance between $p_0^{*}$ and $p_0^{p'}$.


Recall that the spherical localisation procedure cannot be applied if the three sensors are
on a common great circle. Therefore, we impose the practical restriction that they cannot
lie in a plane that is closer than αr to the centre of the sphere, i.e.,

$$\frac{h}{r} > \alpha \tag{19}$$

and acceptable values for α will be indicated below.

As depicted in Figure 3, as the least internal angle $\angle p_2 p_1 p_3$ of the triangle formed
by the three sensors goes to 0, $\angle p_2 O_p p_3$ goes to 0 at the same time and points p2 and p3
tend to overlap; consequently, the three sensors tend to be collinear, with the result
that the two-dimensional localisation procedure does not work. Since the angle $\angle p_2 O_p p_3$
approximately equals $d_{\Delta\min}/r_p$, we impose another practical restriction that
$\angle p_2 O_p p_3$ is larger than a small constant β, i.e.,

$$\frac{d_{\Delta\min}}{r_p} > \beta \tag{20}$$

and again, acceptable values for β will be indicated below.


As $d_{\Delta\min}/r_p$ approaches 0, the three sensors tend to be collinear and therefore also to be
coplanar with the origin O, which means h/r falls to 0 as well. Thus, the values of α and β
are not independent. In addition, when $d_{\Delta\min}$ and rp simultaneously approach 0, it is
possible for the least internal angle to be still larger than β while h approaches r (certainly
larger than αr), but the three sensors gradually concentrate at one point, which results in
both collinearity and coplanarity of the three sensors and the centre of the sphere. To
avoid the exceptional case, we restrict the minimal inter-sensor distance $d_{\Delta\min}$ to be
larger than a small positive constant ρ, i.e.,

$$d_{\Delta\min} > \rho \tag{21}$$

which can be easily fulfilled. For instance, ρ = 0.001 km is reasonable in a real system
but sufficient for the constraint.

Figure  3  Nearly  collinear  sensors 


Note:  The  circle  corresponds  to  C  with  Op  as  its  centre  and  radius  rp. 

Essentially, both α and β are determined by the geometric layout of three sensors
involved in a localisation problem, and together with ρ describe how close the
localisation problem is to the unacceptable situations, i.e., the three sensors being
collinear and the three sensors being coplanar with the centre of the sphere. Both of them
are suggested by the simulation evidence of Section 5.3 as being separately necessary
lower bounds.
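
The three layout restrictions (19) to (21) can be checked for a given sensor triple with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the inter-sensor distances are taken to be great circle distances (an assumption, since this section does not state which distance is meant), and the default thresholds are the values suggested later in Section 5.3.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def layout_parameters(p1, p2, p3, r):
    """Return (h, r_p, minimal inter-sensor distance) for sensors on a sphere."""
    normal = cross(sub(p2, p1), sub(p3, p1))       # normal of the plane A
    h = abs(dot(normal, p1)) / norm(normal)        # distance from O to A
    r_p = math.sqrt(max(r * r - h * h, 0.0))       # radius of the disk C

    def great_circle(p, q):
        c = dot(p, q) / (norm(p) * norm(q))
        return r * math.acos(max(-1.0, min(1.0, c)))

    d_min = min(great_circle(p1, p2), great_circle(p1, p3),
                great_circle(p2, p3))
    return h, r_p, d_min

def planar_layout_ok(p1, p2, p3, r, alpha=0.995, beta=0.1, rho=0.001):
    """Check restrictions (19), (20) and (21)."""
    h, r_p, d_min = layout_parameters(p1, p2, p3, r)
    return h / r > alpha and d_min > beta * r_p and d_min > rho
```

Writing (20) as d_min > beta * r_p avoids a division by zero in the degenerate case where the three sensors lie on a common great circle and h is zero.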

5.2  The  error  from  a  planar  approximation 

Measuring the difference between localisation results for one target with and without a
planar assumption, namely the Euclidean distance between the two position estimates
$|p_0^{*} - p_0^{p}|$, defines a metric for the error from the planar approximation. In Figure 2, we
notice that the great circle distance between $p_0^{*}$ and $p_0^{p'}$, i.e., $\hat{\sigma}$, and the Euclidean
distance between $p_0^{p}$ and $p_0^{p'}$, i.e., $|p_0^{p'} - p_0^{p}|$, are like two components of $|p_0^{*} - p_0^{p}|$ along
two different directions: 'angular' and 'radial'. Moreover, the great circle distance $\hat{\sigma}$
reflects the aspect of the error that directly affects prospective usage of the estimated
position under the planar approximation. The Euclidean distance $|p_0^{p'} - p_0^{p}|$ reflects the
aspect of the error from the non-planar characteristic of the spherical surface, and it can
easily be compensated for by a projection step following the two-dimensional
localisation. Thus, the angular error is of great importance.

Because the Euclidean distance σ between $p_0^{*}$ and $p_0^{p'}$ approaches the corresponding
great circle distance $\hat{\sigma}$ when $\hat{\sigma}$ is small, we use σ as an approximation to $\hat{\sigma}$. When $p_0^{p}$
lies inside the sphere as illustrated in Figure 2, $|p_0^{p'} - p_0^{p}|$ = r - l. But there are two other
cases according to the position of $p_0^{p}$. One is that $p_0^{p}$ lies on the sphere, where r = l and
$|p_0^{p'} - p_0^{p}|$ = 0; the other is that $p_0^{p}$ lies outside the sphere, where $|p_0^{p'} - p_0^{p}|$ = l - r. Thus,
we conclude that $|p_0^{p'} - p_0^{p}|$ = |r - l|. As such, we can use σ and |r - l| to denote the errors
from the angular and the radial directions respectively. Moreover, because d indicates the
scale of distances involved in the localisation problem, it is appropriate for it to be used
to normalise the errors, i.e., dividing the absolute distances by it. Therefore, instead of
using the metric $|p_0^{*} - p_0^{p}|$ for the error introduced by the planar approximation, we define
two sub-errors, i.e., the angular error and the radial error, as follows:


$$e_a = \frac{\sigma}{d} \tag{22}$$

$$e_r = \frac{|r - l|}{d} \tag{23}$$


As in Huang et al. (2008), the notation O(·) is used to describe orders of magnitude of
some quantities. Suppose in particular that f is a function of variable x. For some interval
X of ℝ, typically including 0 or ∞, f = O(x) means for some constant k, there holds
|f| < k|x| for all x in X. We can also extend the definition to treat powers of x. In Huang et
al. (2008), we obtain orders of magnitude of sub-errors as follows:

$$e_a = O\left(\frac{d^3}{r^3}\right) \tag{24}$$

$$e_r = O\left(\frac{d}{r}\right) \tag{25}$$

which  show  that  er  is  roughly  proportional  to  d  and  the  power  in  the  order  of  ea  is  cubic 
rather  than  linear.  Since  there  is  always  a  possibility  of  compensating  for  er  through  a 
projection  operation,  we  are  more  concerned  with  ea. 
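
For concreteness, the two sub-errors can be computed from a spherical estimate and a planar estimate as in the Python sketch below. The function name and argument names are hypothetical; the planar estimate is projected radially onto the sphere before the angular sub-error is measured, as described in this subsection.

```python
import math

def planar_sub_errors(p_spherical, p_planar, r, d):
    """Return (e_a, e_r) for one localisation instance.

    p_spherical: estimate from the spherical method (on the sphere)
    p_planar:    estimate from the planar method (in the sensor plane)
    r:           radius of the sphere
    d:           largest pairwise distance, used for normalisation
    """
    l = math.sqrt(sum(x * x for x in p_planar))
    # Radial projection of the planar estimate onto the sphere.
    p_projected = tuple(r * x / l for x in p_planar)
    sigma = math.sqrt(sum((a - b) ** 2
                          for a, b in zip(p_spherical, p_projected)))
    e_a = sigma / d           # angular sub-error, chord used in place of arc
    e_r = abs(r - l) / d      # radial sub-error
    return e_a, e_r

# Usage on a toy unit-sphere instance.
print(planar_sub_errors((0.0, 0.0, 1.0), (0.3, 0.0, 0.4), r=1.0, d=2.0))
```

The radial sub-error vanishes whenever the planar estimate already lies on the sphere, which is why a projection step removes it.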


5.3  Simulations 

The  above  conclusions  express  errors  in  terms  of  orders  of  magnitude  of  certain 
quantities.  In  this  section,  we  provide  simulation  evidence  for  the  analytical  results  which 
also  allows  us  to  make  more  precise  statements  about  the  levels  of  error.  These 
simulations  treat  a  large  number  of  different  localisation  problems  involving  three 
sensors  and  one  target,  with  MATLAB  used  to  determine  localisation  solutions.  In  all 
instances,  the  same  problem  is  solved  by  using  both  the  two-dimensional  and  spherical 
CMD  methods.  Further  details  of  the  simulations  are  as  follows: 

•  r  is  assigned  to  be  6,371.3  km,  which  is  the  average  radius  of  the  earth 

• the position of every node, namely three sensors and one target in each instance, is
random, being obtained by generating three spherical coordinates, a constant radial
distance r, a random zenith angle and a random azimuthal angle (if the values of
parameters d, α, β and ρ are required to fulfil certain constraints, the positions are
regenerated until the constraints are fulfilled)

•  errors  in  distance  measurements  are  independent  and  with  different  noise  levels,  see 
below. 
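
The simulation set-up listed above can be sketched as follows in Python (the simulations themselves used MATLAB). This is an illustrative sketch under stated assumptions: the text does not give the distribution of the two angles, so both are drawn uniformly here; the acceptance test is passed in as a caller-supplied function; and all names are hypothetical.

```python
import math
import random

def random_point_on_sphere(r, rng):
    """Constant radial distance r, random zenith and azimuthal angles."""
    zenith = rng.uniform(0.0, math.pi)           # assumed uniform
    azimuth = rng.uniform(0.0, 2.0 * math.pi)    # assumed uniform
    return (r * math.sin(zenith) * math.cos(azimuth),
            r * math.sin(zenith) * math.sin(azimuth),
            r * math.cos(zenith))

def draw_instance(r, rng, accept, max_tries=100000):
    """Regenerate three sensors and one target until accept(nodes) holds."""
    for _ in range(max_tries):
        nodes = [random_point_on_sphere(r, rng) for _ in range(4)]
        if accept(nodes):
            return nodes
    raise RuntimeError("no admissible instance found")

def add_noise(distance, level, rng):
    """Perturb a distance by a relative error uniform in [-level, level]."""
    return distance * (1.0 + rng.uniform(-level, level))

# Usage: one unconstrained instance on the earth, and a 10% noisy distance.
rng = random.Random(0)
nodes = draw_instance(6371.3, rng, accept=lambda nodes: True)
print(nodes[0], add_noise(1000.0, 0.10, rng))
```

Drawing the zenith angle uniformly does not give a uniform distribution over the surface of the sphere; if that matters for a particular study, the sampling rule would need to be changed.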

At first, we conduct simulations with zero noise to probe the effects of α and β on the
error and plot the sub-errors ea and er in Figure 4. As can be seen from Figure 4(a) and
4(b), it is evident that different scales of h/r result in different levels of the sub-errors
except for h/r > 0.995: when h/r < 0.1, both sub-errors are nearly as large as 100% no
matter what d is; but when h/r > 0.995, they are small and increase with d.
Figures 4(c) and 4(d) show that the instances with large sub-errors (corresponding to the
marks on top of the figures) generally have small values of $d_{\Delta\min}/r_p$. When
$d_{\Delta\min}/r_p$ > 0.1, both sub-errors are relatively small and increase with d.

h 

The above simulations show that for the same value of d, small values of h/r and
$d_{\Delta\min}/r_p$ normally lead to large sub-errors. The principal reason is that a small value of
$d_{\Delta\min}/r_p$ or h/r implies that the three sensors tend to be either collinear or coplanar with the
pseudo-sensor, which probably causes a large error in the estimated position, and thus the
difference between the positions estimated by the two-dimensional and spherical
CMD methods probably rises as well. Therefore, based on the simulations, we suggest
that the parameters α and β should be no less than 0.995 and 0.1 respectively to derive
acceptable results when applying a planar approximation.

The case of α = 0.995 indicates rp < 636 km, and because $d_{\Delta\max}$ denotes the maximal
inter-sensor distance which is less than the diameter of C, we have $d_{\Delta\max}$ < 1,272 km. On
the other hand, the case of β = 0.1 indicates that the least angle formed by point Op and
any two of the three points p1, p2, p3 is larger than 2.86°. In addition, when h/r > 0.995
and $d_{\Delta\min}/r_p$ > 0.1, both sub-errors appear to increase with d and their overall growing
trends are consistent with the curves corresponding to $d^3/r^3$ and $d/r$ respectively, which
also verifies equations (24) and (25).
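
The bound on rp quoted for α = 0.995 follows from the right triangle formed by O, Op and any one sensor. A short worked check, assuming r = 6,371.3 km as in the simulations:

```latex
\begin{align*}
r_p &= \sqrt{r^2 - h^2} = r\sqrt{1 - (h/r)^2} < r\sqrt{1 - \alpha^2} \\
    &= 6371.3 \times \sqrt{1 - 0.995^2}\ \text{km}
     = 6371.3 \times \sqrt{0.009975}\ \text{km}
     \approx 636.3\ \text{km}, \\
2 r_p &< 2 \times 636.3\ \text{km} \approx 1272.7\ \text{km}.
\end{align*}
```

The figures of 636 km and 1,272 km in the text appear to be these values with the fractional part dropped.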

Figure 4 Simulations with zero noise. In (a) and (b), sub-errors ea and er with different scales of
h/r are displayed in different shapes: diamond, 0 < h/r < 0.1; square, 0.1 < h/r < 0.8;
pentagon, 0.8 < h/r < 0.9; triangle, 0.9 < h/r < 0.995; circle, 0.995 < h/r < 1. In (c) and (d),
sub-errors ea and er with different scales of $d_{\Delta\min}/r_p$ are displayed in different shapes:
diamond, 0 < $d_{\Delta\min}/r_p$ < 0.0001; square, 0.0001 < $d_{\Delta\min}/r_p$ < 0.001; pentagon,
0.001 < $d_{\Delta\min}/r_p$ < 0.01; triangle, 0.01 < $d_{\Delta\min}/r_p$ < 0.1; circle, 0.1 < $d_{\Delta\min}/r_p$ < √3
(see online version for colours)


Note: The vertical axes are in logarithm scale.

Secondly, we simulate the same localisation problems for zero noise and two different
noise levels: 10% and 30% (x% means that the percentage of the distance measurement
error is uniformly distributed within -x% and x%) and the parameters α, β and ρ are
assigned to be 0.995, 0.1 and 0.001 km respectively. The resultant sub-errors ea and er are
plotted in Figure 5. The overall growing trends of both sub-errors are still near the solid
curves corresponding to $d^3/r^3$ and $d/r$. Moreover, both sub-errors increase slightly with
noise rising from zero to 10% but increase noticeably with noise rising from 10% to 30%.


Figure 5 Simulation for different levels of noise when α = 0.995, β = 0.1 and ρ = 0.001 km,
(a), (c) and (e) show the sub-error ea, (b), (d) and (f) show the sub-error er (see online
version for colours)

[Plot panels omitted; horizontal axis: d (km).]

Notes: Distance measurements in (a) and (b) are noiseless; distance measurements in
(c) and (d) are with 10% noise; distance measurements in (e) and (f) are with 30%
noise. Each dot represents a sub-error in one localisation instance. The vertical
axes use a logarithm scale.

By the method of trial and error, we can derive three approximate upper bounds on ea for the
three noise levels: $\frac{d^3}{5r^3} + \frac{d^2}{5r^2}$, $\frac{d^3}{5r^3} + \frac{d^2}{4r^2}$ and $\frac{d^3}{5r^3} + \frac{d^2}{2r^2}$, corresponding to the
dashed curves in Figure 4. Table 1 lists some values of these upper bounds of ea. Since
we are more concerned with ea, the data in Table 1 give us great confidence in applying
the planar approximation due to their small magnitude.


Table 1 Upper bounds on ea with zero, 10% and 30% noise

Noise    d = 10 km     d = 600 km    d = 1,000 km    d = 2,000 km
Zero     4.93e-5%      0.19%         0.57%           2.59%
10%      6.17e-5%      0.24%         0.69%           3.08%
30%      1.23e-4%      0.46%         1.31%           5.55%
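
The entries of Table 1 can be reproduced from the three empirical bounds with a few lines of Python. The sketch assumes the bounds have the form d^3/(5r^3) + c d^2/r^2 with c = 1/5, 1/4 and 1/2 for zero, 10% and 30% noise, and r = 6,371.3 km; the names used are hypothetical.

```python
R_EARTH_KM = 6371.3

# Coefficient of the quadratic term for each noise level.
QUADRATIC_COEFF = {"zero": 1.0 / 5.0, "10%": 1.0 / 4.0, "30%": 1.0 / 2.0}

def angular_error_bound(d_km, noise, r_km=R_EARTH_KM):
    """Empirical upper bound on the angular sub-error e_a (as a fraction)."""
    x = d_km / r_km
    return x ** 3 / 5.0 + QUADRATIC_COEFF[noise] * x ** 2

for noise in ("zero", "10%", "30%"):
    row = [100.0 * angular_error_bound(d, noise) for d in (10, 600, 1000, 2000)]
    print(noise, ["%.3g%%" % value for value in row])
```

For small d the quadratic term dominates the bound, so doubling d roughly quadruples the bound in that regime.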

Whether a planar approximation is acceptable is decided by multiple factors, including
accuracy requirements on estimated positions, noise levels of distance measurements and
the characteristics of the localisation problems described by parameters, such as d, α, β
and ρ. Given certain parameters, we can predict the upper bounds on both sub-errors ea
and er, compare the upper bounds with the accuracy requirements on position estimates
and uncertainties in distance measurements, and then decide whether the planar
approximation is acceptable. For example, provided that in a sensor network the
parameters α, β and ρ are the same as we have assigned in simulations corresponding to
Figure 5 and the noise level is 10%, ea will be trivial (less than 0.69%) as long as d is less
than 1,000 km. Otherwise, if sensors are equipped with exact distance measuring devices
and accordingly high accuracies on positions are required, or an extremely small value of
one parameter is admitted, we should be more cautious in accepting the planar
approximation.


6  Concluding  remarks 

In  this  paper,  we  have  dealt  with  the  problem  of  localisation  on  earth  using  distance 
measurements,  when  the  planar  assumption  becomes  invalid.  We  use  the  geometrical 
constraints  for  compensation  of  the  effect  of  noisy  measurements,  by  formulation  of  a 
SCM.  Although  three-dimensional  ideas  are  being  used  (for  which  normally  four  sensors 
would  be  expected),  localisation  can  be  achieved  with  but  three  sensors,  as  for  the  case  of 
planar  localisation.  This  simple  yet  effective  idea  is  verified  using  simulation  examples. 
The  numerical  range  of  validity  of  a  planar  assumption  for  practical  problems  of 
localisation  on  the  earth’s  surface  is  also  studied.  Clearly  for  small  enough  distances,  a 
planar  approximation  will  be  satisfactory. 

A  number  of  issues  remain  to  be  addressed.  The  SCM  is  expressed  using  Euclidean 
distances  for  simplicity  and  following  convention.  An  expression  using  great  circle 
distances  (and  trigonometric  functions)  is  an  easy  extension.  A  different  variation  is 
obtained  by  adding  altitude  measures  into  the  formulation;  the  problem  can  be  easily 
extended  to  the  case  when  the  sensors  and  the  target  are  at  different  heights,  though  some 
a  priori  estimate  of  target  height  would  be  required,  to  be  able  to  record  a  Euclidean 
distance  from  the  centre  of  the  earth.  Note  that  ellipsoidal  models  are  in  fact  available  for 
certain  large  areas  of  the  earth,  e.g.,  of  the  size  of  Australia;  one  could  imagine  an 
iterative  localisation  scheme,  in  which  a  height  value  was  assumed  based  on  the  current 
position  estimate,  and  then  a  new  position  estimate  would  be  obtained. 

A projection method might be used to localise points on a sphere. Sensor to target
distance  measurements  could  be  projected  onto  the  plane  formed  by  the  three  sensors, 
and  then  a  planar  model  can  be  used.  However,  the  complexity  and  the  effectiveness  of 
this  approach  are  yet  to  be  determined. 

Abstract: Despite intensive research in the area of network
connectivity,  there  is  an  important  category  of  problems  that 
remain  unsolved:  how  to  measure  the  quality  of  connectivity 
of  a  wireless  multi-hop  network  which  has  a  realistic  number 
of  nodes,  not  necessarily  large  enough  to  warrant  the  use  of 
asymptotic  analysis,  and  has  unreliable  connections,  reflecting  the 
inherent  unreliable  characteristics  of  wireless  communications? 
The  quality  of  connectivity  measures  how  easily  and  reliably  a 
packet  sent  by  a  node  can  reach  another  node.  It  complements  the 
use  of  capacity  to  measure  the  quality  of  a  network  in  saturated 
traffic  scenarios  and  provides  a  native  measure  of  the  quality 
of  (end-to-end)  network  connections.  In  this  paper,  we  explore 
the  use  of  probabilistic  connectivity  matrix  as  a  tool  to  measure 
the  quality  of  network  connectivity.  Some  interesting  properties 
of  the  probabilistic  connectivity  matrix  and  their  connections  to 
the  quality  of  connectivity  are  demonstrated.  We  show  that  the 
largest  eigenvalue  of  the  probabilistic  connectivity  matrix  can 
serve  as  a  good  measure  of  the  quality  of  network  connectivity. 

Index Terms: Connectivity, network quality, probabilistic connectivity matrix

I.  Introduction 

Connectivity is one of the most fundamental properties of
wireless multi-hop networks [1]-[3], and is a prerequisite for
providing many network functions. A network is said to be
connected if and only if (iff) there is a (multi-hop) path
between any pair of nodes. Further, a network is said to be
k-connected iff there are k mutually independent paths between
any pair of nodes that do not share any node in common
except the starting and the ending nodes. k-connectivity is
often required for robust operations of the network.

There are two general approaches to studying the connectivity
problem. The first, spearheaded by the seminal work
of  Penrose  [3]  and  Gupta  and  Kumar  [1],  is  based  on  an 
asymptotic  analysis  of  large-scale  random  networks,  which 
considers  a  network  of  n  nodes  that  are  i.i.d.  on  an  area  with 
an  underlying  uniform  distribution.  A  pair  of  nodes  are  directly 
connected  iff  their  Euclidean  distance  is  smaller  than  or  equal 
to a given threshold $r(n)$, independent of other connections.
Some interesting results are obtained on the value of $r(n)$
required for the above network to be asymptotically almost
surely connected as $n \to \infty$. In [4], [5], the authors extended
the above results from the unit disk model to a random
connection model, in which any pair of nodes separated by a
displacement $x$ are directly connected with probability $g(x)$,
independent  of  other  connections.  We  refer  readers  to  [7]  for 
a  more  comprehensive  review  of  related  work. 

The second approach is based on a deterministic setting and studies the connectivity and other topological properties of a network using algebraic graph theory. Specifically, consider a network with a set of $n$ nodes. Its properties can be studied using its underlying graph $G(V, E)$, where $V = \{v_1, \ldots, v_n\}$ denotes the vertex set and $E$ denotes the edge set. The underlying graph is obtained by representing each node in the network uniquely using a vertex and the converse. An undirected edge exists between two vertices iff there is a direct connection (or link) between the associated nodes¹. Define the adjacency matrix $A_G$ of the graph $G(V, E)$ to be a symmetric $n \times n$ matrix whose $(i,j)$th entry, $i \neq j$, is equal to one if there is an edge between $v_i$ and $v_j$ and is equal to zero otherwise. Further, the diagonal entries of $A_G$ are all equal to zero. The eigenvalues of the graph $G(V, E)$ are defined to be the eigenvalues of $A_G$. The network connectivity information, e.g. connectivity and $k$-connectivity, is entirely contained in its adjacency matrix. Many interesting connectivity and topological properties of the network can be obtained by investigating the eigenvalues of its underlying graph. For example, let $\mu_1 \geq \cdots \geq \mu_n$ be the eigenvalues of a graph $G$. If $\mu_1 = \mu_2$, then $G$ is disconnected. If $\mu_1 = -\mu_n$ and $G$ is not empty, then at least one connected component of $G$ is nonempty and bipartite. If the number of distinct eigenvalues of $G$ is $r$, then $G$ has a diameter of at most $r - 1$ [8]. Some researchers have also studied the properties of the underlying graph using its Laplacian matrix [9], where the Laplacian matrix of a graph $G$ is defined as $L_G = D - A_G$ and $D$ is a diagonal matrix with degrees of vertices in $G$ on the diagonal. Particularly, the algebraic connectivity of a graph $G$ is the second-smallest eigenvalue of $L_G$ and it is greater than 0 iff $G$ is a connected graph. The algebraic connectivity quantifies the speed of convergence of consensus algorithms [10]. We refer readers to [8] for a comprehensive treatment of the topic.
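
As a small worked illustration of these spectral facts (our own example, not taken from [8]), consider the path graph on three vertices with edges $v_1 v_2$ and $v_2 v_3$. Its adjacency and Laplacian matrices and their eigenvalues are:

```latex
A_G = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\qquad \mu_1 = \sqrt{2}, \quad \mu_2 = 0, \quad \mu_3 = -\sqrt{2},
\\[6pt]
L_G = D - A_G = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix},
\qquad \text{eigenvalues } 0,\ 1,\ 3 .
```

Here $\mu_1 = -\mu_3$, consistent with the graph being bipartite. There are three distinct adjacency eigenvalues, and the diameter is 2, which meets the bound $r - 1 = 2$. The second-smallest Laplacian eigenvalue (the algebraic connectivity) is $1 > 0$, in agreement with the graph being connected.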

¹ In this paper, we limit our discussion to simple, undirected graphs (networks), in which there is at most one edge (link) between a pair of vertices (nodes).

Despite intensive research in the area, there is an important category of problems that remain unsolved: how to measure the quality of connectivity of a wireless multi-hop network which has a realistic number of nodes, not necessarily large enough to warrant the use of asymptotic analysis, and has unreliable connections, reflecting the inherently unreliable characteristics of wireless communications? The quality of connectivity measures how easily and reliably a packet sent by a node can reach another node. It complements the use of capacity to measure the quality of a network in saturated traffic scenarios and provides a native measure of the quality of (end-to-end) network connections. In the following paragraphs, we elaborate on the above question using two examples.

Example  1:  Consider  a  network  with  a  fixed  number  of 
nodes  with  known  transmission  power  to  be  deployed  in  a 
region.  Assume  that  the  wireless  propagation  model  in  that 
environment is known and its characteristics have been quantified
through a priori measurements or empirical estimation.
Further,  a  link  exists  between  two  nodes  iff  the  received  signal 
strength  from  one  node  at  the  other  node  is  greater  than  or 
equal  to  a  predetermined  threshold  and  the  same  is  also  true 
in  the  opposite  direction.  One  can  then  find  the  probability  that 
a  link  exists  between  two  nodes  at  two  fixed  locations:  It  is 
determined  by  the  probability  that  the  received  signal  strength 
is  greater  than  or  equal  to  the  pre-determined  threshold.  Two 
related  questions  can  be  asked:  a)  If  these  nodes  are  deployed 
at  a  set  of  known  locations,  what  is  the  quality  of  connectivity 
of  the  network,  measured  by  the  probability  that  there  is  a 
path  between  any  two  nodes,  as  compared  to  node  deployment 
at  another  set  of  locations?  b)  How  to  optimize  the  node 
deployment  to  maximize  the  quality  of  connectivity? 

Example 2: Consider a network with a fixed number of nodes. The transmission between a pair of nodes with a direct connection is successful with a known probability, quantifying the inherently unreliable characteristics of wireless communications. There are no direct connections between some pairs of nodes because the probability of successful transmission between them is too low to be acceptable. How should one measure the quality of connectivity of such a network, in the sense that a packet transmitted from one node can easily and reliably reach another node via a multi-hop path? Will a single “good” path between a pair of nodes be preferable to multiple “bad” paths? These questions are further illustrated using Fig. 1 and 2.

In this paper, we explore the use of the probabilistic connectivity
matrix, a concept to be defined later in Section II,
as  a  tool  to  measure  the  quality  of  network  connectivity. 
Some  interesting  properties  of  the  probabilistic  connectivity 
matrix  and  their  connections  to  the  quality  of  connectivity  are 
demonstrated.  Based  on  the  analysis,  we  show  that  the  largest 
eigenvalue  of  the  probabilistic  connectivity  matrix  can  serve 
as  a  good  metric  of  the  quality  of  network  connectivity. 

The  rest  of  the  paper  is  organized  as  follows.  Section  II 
defines  the  network  settings,  the  probabilistic  connectivity 
matrix  and  gives  a  method  to  compute  the  matrix.  Section 
III  introduces  certain  inequalities  associated  with  the  entries 
of  the  probabilistic  connectivity  matrix.  Section  IV  proves 
several  important  results  about  the  probabilistic  connectivity 
matrix.  These  directly  associate  the  largest  eigenvalue  of  the 
probabilistic  connectivity  matrix  to  the  quality  of  connectivity 
and  expose  a  structure  that  holds  the  promise  of  facilitating 
associated optimization tasks. Section V concludes the paper and discusses future work.

Figure 1: An illustration of networks with different quality of connectivity. A solid line represents a direct connection between two nodes and the number beside the line represents the corresponding probability of successful transmission. The networks shown in (a), (b), and (c) are all connected networks but not 2-connected networks, i.e. their connectivity cannot be differentiated using the $k$-connectivity concept. However, intuitively the quality of the network in (b) is better than that of the network in (a) because of the availability of the additional high-quality link between $v_2$ and $v_4$ in (b). The quality of the network in (c) is even better because of the availability of the additional nodes and the associated high-quality links, hence additional routes, if these additional nodes act as relay nodes only. If these additional nodes also generate their own traffic, it is uncertain whether the quality of the network in (c) is better or not. Therefore it is important to develop a measure to quantitatively compare the quality of connectivity (for the networks in (a) and (b)) and to evaluate the benefit of additional nodes on connectivity (for the network in (c)).

Figure 2: The networks shown in (a) and (b) have the same topology but different link quality. It is difficult to compare the quality of the two networks.

II.  Definition  and  Construction  of  the 
Probabilistic  Connectivity  Matrix 

Consider  a  network  of  n  nodes.  For  some  pair  of  nodes, 
an  edge  (or  link)  may  exist  with  a  non-negligible  probability. 
The  edges  are  undirected  and  independent. 

Denote the underlying graph of the above network by $G(V, E)$, where $V = \{v_1, \ldots, v_n\}$ is the vertex set and $E = \{e_1, \ldots, e_m\}$ is the edge set, which contains the set of all possible edges. Here the vertices and the edges are indexed from 1 to $n$ and from 1 to $m$ respectively. For convenience, in some parts of this paper we also use the symbol $e_{ij}$ to denote an edge between vertices $v_i$ and $v_j$ when there is no confusion. We associate with each edge $e_i$, $i \in \{1, \ldots, m\}$, an indicator random variable $I_i$ such that $I_i = 1$ if the edge $e_i$ exists and $I_i = 0$ if the edge $e_i$ does not exist. The indicator random variables $I_{ij}$, $i \neq j$ and $i, j \in \{1, \ldots, n\}$, are defined analogously.

In  the  following,  we  give  a  definition  of  the  probabilistic 
adjacency  matrix: 

Definition 1: The probabilistic adjacency matrix of $G(V, E)$, denoted by $A_G$, is an $n \times n$ matrix such that its $(i,j)$th entry, $i \neq j$, is $a_{ij} = \Pr(I_{ij} = 1)$ and its diagonal entries are all equal to 1.

Due to the undirected property of an edge mentioned above, $A_G$ is a symmetric matrix, i.e. $a_{ij} = a_{ji}$. Note that the diagonal entries of $A_G$ are defined to be 1, which differs from the convention common in the literature. This treatment of the diagonal entries can be associated with the fact that a node in the network can store a packet until a better transmission opportunity arises when it finds the wireless channel busy [11].

The probabilistic connectivity matrix is defined as follows:

Definition 2: The probabilistic connectivity matrix of $G(V, E)$, denoted by $Q_G$, is an $n \times n$ matrix such that its $(i,j)$th entry, $i \neq j$, is the probability that there exists a path between vertices $v_i$ and $v_j$, and its diagonal entries are all equal to 1.

As a ready consequence of the symmetry of $A_G$, $Q_G$ is also a symmetric matrix.

Given the probabilistic adjacency matrix $A_G$, the probabilistic connectivity matrix $Q_G$ is fully determined. However, the computation of $Q_G$ is not trivial because for a pair of vertices $v_i$ and $v_j$, there may be multiple paths between them and some of them may share common edges, i.e. are not independent. In the following paragraph, we give an approach to computing the probabilistic connectivity matrix.

Let $(I_1, \ldots, I_m)$ be a particular instance of the indicator random variables associated with an instance of the random edge set. Let $Q_G \mid (I_1, \ldots, I_m)$ be the connectivity matrix of $G$ conditioned on $(I_1, \ldots, I_m)$. The $(i,j)$th entry of $Q_G \mid (I_1, \ldots, I_m)$ is either 0, when there is no path between $v_i$ and $v_j$, or 1, when there exists a path between $v_i$ and $v_j$. The diagonal entries of $Q_G \mid (I_1, \ldots, I_m)$ are always 1. Conditioned on $(I_1, \ldots, I_m)$, $G(V, E)$ is just a deterministic graph. Therefore the entries of $Q_G \mid (I_1, \ldots, I_m)$ can be efficiently computed using a search algorithm, such as breadth-first search. Given $Q_G \mid (I_1, \ldots, I_m)$, $Q_G$ can be computed using the following equation:

$$Q_G = E\big(Q_G \mid (I_1, \ldots, I_m)\big) \tag{1}$$

where the expectation is taken over all possible instances of $(I_1, \ldots, I_m)$.

The approach suggested in the last paragraph is essentially
a brute-force approach to computing $Q_G$. A more efficient
algorithm is suggested in Section IV.
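
The brute-force evaluation of (1) can be sketched as follows. This is a minimal illustration, not the implementation behind any result in this paper: the function name `connectivity_matrix`, the dictionary input format and the 0-based vertex indices are our assumptions. It enumerates all $2^m$ edge states, so it is practical only for small $m$.

```python
from collections import deque
from itertools import product


def connectivity_matrix(n, edge_probs):
    """Brute-force Q_G for a network with n vertices.

    edge_probs maps an undirected edge (i, j), 0-based, to its
    existence probability; edges are assumed to be independent.
    """
    edges = list(edge_probs)
    Q = [[0.0] * n for _ in range(n)]
    # Enumerate every realization (I_1, ..., I_m) of the edge indicators.
    for states in product((0, 1), repeat=len(edges)):
        weight = 1.0  # probability of this realization
        adj = [[] for _ in range(n)]
        for (i, j), up in zip(edges, states):
            p = edge_probs[(i, j)]
            weight *= p if up else 1.0 - p
            if up:
                adj[i].append(j)
                adj[j].append(i)
        if weight == 0.0:
            continue
        # Breadth-first search from each vertex of the deterministic graph.
        for s in range(n):
            seen = {s}
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        queue.append(v)
            for t in seen:
                # The conditional entry is 1; weight it by Pr(realization).
                Q[s][t] += weight
    return Q


# Hypothetical example: a triangle whose three links each exist w.p. 0.5.
triangle = {(0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5}
Q = connectivity_matrix(3, triangle)
print(Q[0][1])  # direct link or two-hop detour: 1 - (1 - 0.5) * (1 - 0.25)
```

In this example each off-diagonal entry equals 0.625, which exceeds the link probability 0.5 because the two-hop detour adds to the direct link.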


Remark  1:  A  major  difference  between  the  (probabilistic) 
connectivity  matrix  and  the  adjacency  matrix  (or  the  Laplacian 
matrix) is that the latter matrix focuses on quantifying the
relation  between  node  pairs  directly  connected  by  an  edge  only 
while  the  former  matrix  focuses  on  quantifying  the  end-to- 
end  relationship  between  node  pairs.  It  is  not  trivial  to  obtain 
the  connectivity  matrix  from  the  adjacency  matrix  or  use  the 
adjacency  matrix  to  study  network  properties  easily  obtainable 
using  the  connectivity  matrix. 

Remark 2: For simplicity, the terms used in our discussion are based on the problems in Example 1. The discussion however can be easily adapted to the analysis of the problems in Example 2. For example, if $a_{ij}$ is defined to be the probability that a transmission between nodes $v_i$ and $v_j$ is successful, the $(i,j)$th entry of the probabilistic connectivity matrix $Q_G$ computed using (1) then gives the probability that a transmission from $v_i$ to $v_j$ via a multi-hop path is successful under the best routing algorithm, which can always find a shortest and error-free path from $v_i$ to $v_j$ if it exists, or alternatively, the probability that a packet broadcast from $v_i$ can reach $v_j$ where each node receiving the packet only broadcasts the packet once. Therefore the $(i,j)$th entry of $Q_G$ can be used as a quality measure of the end-to-end paths between $v_i$ and $v_j$, which takes into account the fact that the availability of extra paths between a pair of nodes can be exploited to improve the probability of successful transmissions.

III.  Some  Key  Inequalities  for  Connection 
Probabilities 

The  entries  of  the  probabilistic  connectivity  matrix  give  a 
measure  of  the  quality  of  end-to-end  paths.  In  this  section, 
we  provide  some  important  inequalities  that  may  facilitate 
further  analysis  of  the  quality  of  connectivity.  Some  of  these 
inequalities  are  exploited  in  the  next  section  to  establish 
several key properties of the probabilistic connectivity matrix
itself. We first introduce some results that are required for the
further analysis of the probabilistic connectivity matrix $Q_G$.

For  a  random  graph  with  a  given  set  of  vertices,  a  particular 
event  is  increasing  if  the  event  is  preserved  when  more 
edges  are  added  into  the  graph.  An  event  is  decreasing  if  its 
complement  is  increasing. 

Denote by $\xi_{ij}$ the event that there is a path between vertices $v_i$ and $v_j$, $i \neq j$. Denote by $\xi_{ikj}$ the event that there is a path between vertices $v_i$ and $v_j$ and that path passes through the third vertex $v_k$, where $k \in \Gamma_n \setminus \{i, j\}$ and $\Gamma_n$ is the set of indices of all vertices. Denote by $\eta_{ij}$ the event that there is an edge between vertices $v_i$ and $v_j$. Denote by $\pi_{ikj}$ the event that there is a path between vertices $v_i$ and $v_k$ and there is a path between vertices $v_k$ and $v_j$, where $k \in \Gamma_n \setminus \{i, j\}$. It can be shown from the above definitions that

$$\xi_{ij} = \eta_{ij} \cup \Big( \bigcup_{k \neq i,j} \xi_{ikj} \Big) \tag{2}$$

Let $q_{ij}$, $i \neq j$, be the $(i,j)$th entry of $Q_G$, i.e. $q_{ij} = \Pr(\xi_{ij})$.

The following lemma can be readily obtained from the FKG inequality [6, Theorem 1.4] and the above definitions.

Lemma 1: For two distinct indices $i, j \in \Gamma_n$ and $\forall k \in \Gamma_n \setminus \{i, j\}$,

$$q_{ij} \geq \max_{k \in \Gamma_n \setminus \{i,j\}} q_{ik} q_{kj} \tag{3}$$

Proof: It follows readily from the above definitions that the event $\xi_{ij}$ is an increasing event. Using the FKG inequality:

$$\Pr(\xi_{ij}) \geq \Pr(\pi_{ikj}) = \Pr(\xi_{ik} \cap \xi_{kj}) \geq \Pr(\xi_{ik}) \Pr(\xi_{kj}) \tag{4}$$

■

Lemma 1 gives a lower bound on $q_{ij}$. The following lemma gives an upper bound on $q_{ij}$:

Lemma 2: For two distinct indices $i, j \in \Gamma_n$ and $\forall k \in \Gamma_n \setminus \{i, j\}$,

$$q_{ij} \leq 1 - (1 - a_{ij}) \prod_{k \in \Gamma_n \setminus \{i,j\}} (1 - q_{ik} q_{kj}) \tag{5}$$

where $a_{ij} = \Pr(\eta_{ij})$.

Proof: We will first show that $\xi_{ikj} \Leftrightarrow \xi_{ik} \,\square\, \xi_{kj}$. That is, the occurrence of the event $\xi_{ikj}$ is a sufficient and necessary condition for the occurrence of the event $\xi_{ik} \,\square\, \xi_{kj}$, where for two events $A$ and $B$, $A \,\square\, B$ denotes the event that there exist two disjoint sets of edges such that the first set of edges guarantees the occurrence of $A$ and the second set of edges guarantees the occurrence of $B$.

Using the definition of $\xi_{ikj}$, occurrence of $\xi_{ikj}$ means that there is a path between vertices $v_i$ and $v_j$ and that path passes through vertex $v_k$. It follows that there exist a path between vertex $v_i$ and vertex $v_k$ and a path between vertex $v_k$ and vertex $v_j$, and the two paths do not have edge(s) in common. Otherwise, it will contradict the definition of $\xi_{ikj}$, particularly as the definition of a path requires the edges to be distinct. Therefore $\xi_{ikj} \Rightarrow \xi_{ik} \,\square\, \xi_{kj}$. Likewise, $\xi_{ikj} \Leftarrow \xi_{ik} \,\square\, \xi_{kj}$ also follows directly from the definitions of $\xi_{ikj}$, $\xi_{ik}$, $\xi_{kj}$ and $\square$. Consequently

$$\Pr(\xi_{ikj}) = \Pr(\xi_{ik} \,\square\, \xi_{kj}) \leq \Pr(\xi_{ik}) \Pr(\xi_{kj}) \tag{6}$$

where the inequality is a direct result of the BK inequality [6].

With a slight abuse of terminology, in the following derivations we also use $\xi_{ikj}$ to represent the set of edges that make the event $\xi_{ikj}$ happen, and use $\eta_{ij}$ to denote the edge between vertices $v_i$ and $v_j$.

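Note that the set of edges $\bigcup_{k \in \Gamma_n \setminus \{i,j\}} \xi_{ikj}$ does not contain $\eta_{ij}$. Therefore, using (2) and independence of edges (used in the third step), and writing an overline for the complement of an event,

$$
\begin{aligned}
q_{ij} &= \Pr\Big(\eta_{ij} \cup \Big(\bigcup_{k \in \Gamma_n \setminus \{i,j\}} \xi_{ikj}\Big)\Big) \\
&= 1 - \Pr\Big(\overline{\eta_{ij}} \cap \Big(\bigcap_{k \in \Gamma_n \setminus \{i,j\}} \overline{\xi_{ikj}}\Big)\Big) \\
&= 1 - (1 - a_{ij}) \Pr\Big(\bigcap_{k \in \Gamma_n \setminus \{i,j\}} \overline{\xi_{ikj}}\Big) \\
&\leq 1 - (1 - a_{ij}) \prod_{k \in \Gamma_n \setminus \{i,j\}} \Pr\big(\overline{\xi_{ikj}}\big) \qquad (7) \\
&= 1 - (1 - a_{ij}) \prod_{k \in \Gamma_n \setminus \{i,j\}} \big(1 - \Pr(\xi_{ikj})\big) \\
&\leq 1 - (1 - a_{ij}) \prod_{k \in \Gamma_n \setminus \{i,j\}} (1 - q_{ik} q_{kj}) \qquad (8)
\end{aligned}
$$

where in (7), the FKG inequality and the fact that $\overline{\xi_{ikj}}$ is a decreasing event are used, and the last step results from (6).

■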

When there is no edge between vertices $v_i$ and $v_j$, which is the generic case, the upper and lower bounds in Lemmas 1 and 2 reduce to

$$\max_{k \in \Gamma_n \setminus \{i,j\}} q_{ik} q_{kj} \leq q_{ij} \leq 1 - \prod_{k \in \Gamma_n \setminus \{i,j\}} (1 - q_{ik} q_{kj}) \tag{9}$$

The above inequality gives insight into how the quality of
paths between a pair of vertices is related to the quality
of paths between other pairs of vertices. It can possibly be
used to determine the most effective way of improving the
quality of a particular set of paths by improving the quality
of a particular (set of) edge(s), or equivalently what can be
reasonably expected from an improvement of a particular edge
on the quality of end-to-end paths.
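
To make (9) concrete, consider a hypothetical four-node network (our own example, not one of the networks in Fig. 1 or 2) with links $e_{13}$, $e_{32}$, $e_{14}$ and $e_{42}$, each existing independently with probability $p = 0.5$, and no direct link between $v_1$ and $v_2$. The path probabilities can be worked out by hand:

```latex
\begin{aligned}
q_{12} &= 1 - (1 - p^2)^2 = 0.4375
  && \text{(two edge-disjoint two-hop paths)} \\
q_{13} = q_{32} = q_{14} = q_{42} &= 1 - (1 - p)(1 - p^3) = 0.5625
  && \text{(direct link or three-hop detour)} \\
\max_{k \in \{3,4\}} q_{1k} q_{k2} &= 0.5625^2 \approx 0.3164 \\
1 - \prod_{k \in \{3,4\}} (1 - q_{1k} q_{k2}) &= 1 - (1 - 0.5625^2)^2 \approx 0.5327
\end{aligned}
```

so that (9) reads $0.3164 \leq 0.4375 \leq 0.5327$ for the pair $(v_1, v_2)$. Neither bound is tight here, because the paths from $v_1$ to $v_k$ and from $v_k$ to $v_2$ are not restricted to disjoint edge sets when $q_{1k}$ and $q_{k2}$ are evaluated separately.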

The following lemma shows that relations among the
entries of the probabilistic connectivity matrix $Q_G$ can be further used to derive
some topological information about the graph.

Lemma 3: If $q_{ij} = q_{ik} q_{kj}$ for three distinct vertices $v_i$, $v_j$ and $v_k$, the vertex set $V$ of the underlying graph $G(V, E)$ can be divided into three non-empty and non-intersecting subsets $V_1$, $V_2$ and $V_3$ such that $v_i \in V_1$, $v_j \in V_3$ and $V_2 = \{v_k\}$, and any possible path between a vertex in $V_1$ and a vertex in $V_3$ must pass through $v_k$. Further, for any pair of vertices $v_l$ and $v_m$, where $v_l \in V_1$ and $v_m \in V_3$, $q_{lm} = q_{lk} q_{km}$.

Proof: Using (4) in the second step, it follows that

$$
\begin{aligned}
q_{ij} &= \Pr\big((\xi_{ij} \setminus \pi_{ikj}) \cup \pi_{ikj}\big) = \Pr(\xi_{ij} \setminus \pi_{ikj}) + \Pr(\pi_{ikj}) \\
&\geq \Pr(\xi_{ij} \setminus \pi_{ikj}) + q_{ik} q_{kj}
\end{aligned}
$$

Therefore $q_{ij} = q_{ik} q_{kj}$ implies that $\Pr(\xi_{ij} \setminus \pi_{ikj}) = 0$ or equivalently $\xi_{ij} \Leftrightarrow \pi_{ikj}$.

Further, $\Pr(\xi_{ij} \setminus \pi_{ikj}) = 0$ implies that a possible path (i.e. a path with a non-zero probability) connecting $v_i$ and $v_k$ and a possible path connecting $v_k$ and $v_j$ cannot have any edge in common. Otherwise a path from $v_i$ to $v_j$, bypassing $v_k$, exists with a non-zero probability, which implies $\Pr(\xi_{ij} \setminus \xi_{ikj}) > 0$. The conclusion follows readily that if $q_{ij} = q_{ik} q_{kj}$ for three distinct vertices $v_i$, $v_j$ and $v_k$, the vertex set $V$ of the underlying graph $G(V, E)$ can be divided into three non-empty and non-overlapping subsets $V_1$, $V_2$ and $V_3$ such that $v_i \in V_1$, $v_j \in V_3$ and $V_2 = \{v_k\}$, and a path between a vertex in $V_1$ and a vertex in $V_3$, if it exists, must pass through $v_k$.

Further, for any pair of vertices $v_l$ and $v_m$, where $v_l \in V_1$ and $v_m \in V_3$, it is easily shown that $\Pr(\xi_{lm} \setminus \pi_{lkm}) = 0$. Due to independence of edges and further using the fact that $\Pr(\xi_{lm} \setminus \pi_{lkm}) = 0$, it can be shown that

$$\Pr(\xi_{lm}) = \Pr(\pi_{lkm}) = \Pr(\xi_{lk} \cap \xi_{km}) = \Pr(\xi_{lk}) \Pr(\xi_{km})$$

where the last step results because under the condition of $\Pr(\xi_{lm} \setminus \pi_{lkm}) = 0$, a path between $v_l$ and $v_k$ and a path between $v_k$ and $v_m$ cannot possibly have any edge in common.

■

An implication of Lemma 3 is that for any three distinct vertices $v_i$, $v_j$ and $v_k$, if the relationship $q_{ij} = q_{ik} q_{kj}$ holds, vertex $v_k$ must be a critical vertex whose removal will render the graph disconnected.

IV. Properties of the Connectivity Matrix

Having  established  some  inequalities  obeyed  by  the  entries 
of $Q_G$, we now turn to establishing a measure of the quality
of  network  connectivity.  At  the  core  of  the  development  in 
this  section  is  the  following  result. 

Lemma 4: Each off-diagonal entry of the probabilistic connectivity matrix $Q_G$ is a multiaffine² function of $a_{ij}$, $i \in \{1, \ldots, n\}$, $j > i$.

Proof: Observe that $a_{ij} = \Pr(\eta_{ij})$ and the events $\eta_{ij}$, $i \in \{1, \ldots, n\}$, $j > i$, are independent. The conclusion in the lemma follows readily from the fact that the event associated with each $q_{ij}$, i.e. there exists a path between vertices $v_i$ and $v_j$, is a union of intersections of these events $\eta_{ij}$, $i \in \{1, \ldots, n\}$, $j > i$. ■

Due to the above multiaffine property, for any four positive integers $k, l, i, j \in \{1, \ldots, n\}$, where $k \neq l$ and $i \neq j$, the following holds:

$$q_{lk} = f(E \setminus \{e_{ij}\})\, a_{ij} + g(E \setminus \{e_{ij}\}) \tag{10}$$

where $f(E \setminus \{e_{ij}\})$ and $g(E \setminus \{e_{ij}\})$ are non-negative constants within $[0, 1]$ determined by the state of the set of edges excluding $e_{ij}$. $g(E \setminus \{e_{ij}\}) = 0$ implies that non-existence of the edge $e_{ij}$ will render the vertices $v_l$ and $v_k$ disconnected. $f(E \setminus \{e_{ij}\}) = 0$ implies that the state of the edge $e_{ij}$ is irrelevant for the end-to-end paths between $v_l$ and $v_k$. Further, $f(E \setminus \{e_{ij}\})$ can be used to measure the criticality of the edge to the end-to-end paths between $v_l$ and $v_k$.
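
The two coefficients in (10) can be recovered numerically by evaluating the path probability with $a_{ij}$ forced to 0 and to 1, since an affine function is fixed by two of its values. The sketch below is illustrative only: `path_probability` and `affine_coefficients` are hypothetical helper names, vertices are 0-based, and the path probability is obtained by brute-force enumeration rather than by any method proposed in this paper.

```python
from itertools import product


def path_probability(n, edge_probs, s, t):
    """Probability that vertices s and t are joined by a path (brute force)."""
    edges = list(edge_probs)
    total = 0.0
    for states in product((0, 1), repeat=len(edges)):
        weight = 1.0  # probability of this edge realization
        adj = [[] for _ in range(n)]
        for (i, j), up in zip(edges, states):
            p = edge_probs[(i, j)]
            weight *= p if up else 1.0 - p
            if up:
                adj[i].append(j)
                adj[j].append(i)
        if weight == 0.0:
            continue
        # Depth-first search from s in this realization.
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if t in seen:
            total += weight
    return total


def affine_coefficients(n, edge_probs, edge, s, t):
    """Return (f, g) such that q_st = f * a_edge + g, as in (10)."""
    absent = dict(edge_probs)
    absent[edge] = 0.0   # the link never exists
    present = dict(edge_probs)
    present[edge] = 1.0  # the link always exists
    g = path_probability(n, absent, s, t)
    f = path_probability(n, present, s, t) - g
    return f, g


# Hypothetical example: a triangle whose three links each exist w.p. 0.5.
triangle = {(0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5}
f, g = affine_coefficients(3, triangle, (0, 1), 0, 1)
q01 = path_probability(3, triangle, 0, 1)
print(f, g, q01)
```

For this triangle, $g$ is the probability of the two-hop detour alone and $f$ is the extra probability contributed by the direct link, so $f \cdot 0.5 + g$ reproduces the path probability between the two end vertices.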

Remark 3: Using the multiaffine property, a more efficient
algorithm for computing $Q_G$ than the one suggested earlier
using (1) can be constructed. Particularly, the probabilistic
connectivity  matrix  of  a  network  forming  a  tree  can  be 
easily  computed.  Therefore  the  algorithm  may  start  by  first 
identifying  a  spanning  tree  in  G(V,E)  and  computing  the 
associated  probabilistic  connectivity  matrix.  Then,  the  edges  in 
E  but  outside  the  spanning  tree  can  be  added  recursively  and 
the  corresponding  probabilistic  connectivity  matrix  updated 
using  (10). 
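
The tree step mentioned in Remark 3 is simple because a tree has exactly one possible path between any two vertices, so each path probability is the product of the link probabilities along that path. The sketch below covers only this step; the function name `tree_connectivity_matrix` and the input format are our assumptions, and the recursive update with (10) is not shown.

```python
from collections import deque


def tree_connectivity_matrix(n, edge_probs):
    """Q_G for a network whose possible edges form a tree.

    edge_probs maps an undirected edge (i, j), 0-based, to its
    existence probability; edges are assumed to be independent.
    """
    adj = [[] for _ in range(n)]
    for (i, j), p in edge_probs.items():
        adj[i].append((j, p))
        adj[j].append((i, p))
    Q = [[0.0] * n for _ in range(n)]
    for root in range(n):
        Q[root][root] = 1.0
        visited = {root}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, p in adj[u]:
                if v not in visited:
                    visited.add(v)
                    # Extend the unique root-to-u path by the edge (u, v).
                    Q[root][v] = Q[root][u] * p
                    queue.append(v)
    return Q


# Hypothetical chain v0 - v1 - v2 with link probabilities 0.9 and 0.8.
chain = {(0, 1): 0.9, (1, 2): 0.8}
Q = tree_connectivity_matrix(3, chain)
print(Q[0][2])  # product of the two link probabilities
```

In the chain example the middle vertex is a critical vertex, and the computed entries satisfy $q_{02} = q_{01} q_{12}$, which matches the situation described in Lemma 3.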

We  comment  later  in  Remark  5  on  how  the  multiaffine 
structure  is  also  potentially  useful  for  performing  some  of  the 
optimization  tasks  inherent  in  maximizing  connectivity,  e.g. 
determination  of  the  link  whose  improvement  will  bring  the 
maximum  benefit  on  connectivity. 

A very desirable property of $Q_G$ is established below.

Theorem 1: The probabilistic connectivity matrix $Q_G$ is a positive semi-definite matrix. Further, $Q_G$ is positive semi-definite but not positive definite iff there exist distinct $i, j \in \{1, \ldots, n\}$ such that $q_{ij} = 1$.

The proof is omitted due to space limitation.

² A multiaffine function is affine in each variable when the other variables are fixed.

Let $\lambda_1 \geq \cdots \geq \lambda_n$ be the eigenvalues of $Q_G$. Note that $\lambda_1 + \cdots + \lambda_n = n$. As an easy consequence of Theorem 1, $n \geq \lambda_1 \geq 1$ and $1 \geq \lambda_n \geq 0$. In the best case, $Q_G$ is a matrix with all entries equal to 1. Then $\lambda_1 = n$ and $\lambda_2 = \cdots = \lambda_n = 0$. In the worst case, $Q_G$ is an identity matrix. Then $\lambda_1 = \cdots = \lambda_n = 1$. This suggests that $\lambda_1$, i.e. the largest eigenvalue of $Q_G$, can be used as a measure of the quality of network connectivity, and a larger $\lambda_1$ indicates a better quality.
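
The smallest non-trivial case illustrates these statements. For a hypothetical two-node network (our own example) whose single link exists with probability $a_{12} = q$:

```latex
Q_G = \begin{pmatrix} 1 & q \\ q & 1 \end{pmatrix},
\qquad \lambda_1 = 1 + q, \quad \lambda_2 = 1 - q,
\qquad \lambda_1 + \lambda_2 = 2 = n .
```

As $q$ grows from 0 to 1, $\lambda_1$ grows from 1 (the worst case) to $2 = n$ (the best case), and $Q_G$ loses positive definiteness exactly when $q = 1$, in agreement with Theorem 1.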

Further, let $X$ be a vector representing the number of packets broadcast by each node to the rest of the network and let $Y$ be a vector representing the random number of packets received by each node. Then $E[Y \mid X] = Q_G X$ represents the expected number of packets received by each node. Using the property that $Q_G$ is a symmetric matrix, it can be shown that

$$
\begin{aligned}
\max_{\|X\|_2 = 1} \|E[Y \mid X]\|_2 &= \max_{\|X\|_2 = 1} \|Q_G X\|_2 \\
&= \max_{\|X\|_2 = 1} \sqrt{X^T Q_G^T Q_G X} = \max_{\|X\|_2 = 1} \sqrt{X^T Q_G^2 X} \\
&= \sqrt{\lambda_{\max}(Q_G^2)} = \sqrt{\lambda_{\max}^2(Q_G)} = \lambda_{\max}(Q_G)
\end{aligned}
$$

where $\lambda_{\max}(Q_G)$ is the maximum eigenvalue of $Q_G$ and $\|X\|_2$ denotes the $L_2$-norm or Euclidean norm of $X$.

We will make this idea that $\lambda_{\max}(Q_G)$ serves as a good measure of the quality of network connectivity more concrete in the following analysis. We start our discussion with a connected network and then extend to more generic cases. We will call a network connected if for all $i, j \in \{1, \ldots, n\}$, $q_{ij} > 0$. Obviously the probabilistic connectivity matrix of a connected network is irreducible [12, p. 374] as all the entries of the matrix are non-zero. As a measure of the quality of network connectivity, if the path probabilities $q_{ij}$ increase, the largest eigenvalue of the probabilistic connectivity matrix should also increase. This is formally stated below:

Theorem 2: Let $G(V, E)$ and $G'(V, E')$ be the underlying graphs of two connected networks defined on the same vertex set $V$ but with different link probabilities. Let $Q_G$ and $Q_{G'}$ be the probabilistic connectivity matrices of $G$ and $G'$ respectively. If $Q_{G'} - Q_G$ is a non-zero, non-negative matrix³, then $\lambda_{\max}(Q_G) < \lambda_{\max}(Q_{G'})$.

Proof: We need the following lemma to prove Theorem 2.

Lemma 5: Suppose $A = A^T$ and $B = B^T$ are non-negative, irreducible, real matrices, and $B - A$ is a non-zero, non-negative matrix. Then $\lambda_{\max}(A) < \lambda_{\max}(B)$.

Proof: Observe that at least one element of $B - A$ is positive. From the Perron-Frobenius theorem [12, p. 536], the eigenvector $x \in \mathbb{R}^n$ corresponding to the largest eigenvalue of $A$ can be chosen to have all elements positive. Then the result follows from the fact that

$$
\begin{aligned}
\lambda_{\max}(A)\, x^T x &= x^T A x \\
&= x^T B x - x^T (B - A) x \\
&< x^T B x \leq \lambda_{\max}(B)\, x^T x
\end{aligned}
$$

as $B - A$ is a non-zero, non-negative matrix. ■

Turning to the proof of Theorem 2, we note that the result follows directly from Lemma 5 and the fact that $Q_{G'}$ and $Q_G$ satisfy the requirements of $B$ and $A$, respectively.

³ A matrix is non-negative if all its entries are greater than or equal to 0.

If the network is not connected, i.e. some entries of its probabilistic connectivity matrix are 0, the network can be decomposed into disjoint components. Let the total number of components in the network be $k$. Let $G_i$ be the subgraph induced on the set of vertices in the $i$th component and $Q_{G_i}$ be the probabilistic connectivity matrix of $G_i$. It follows that

$$\lambda_{\max}(Q_G) = \max\{\lambda_{\max}(Q_{G_1}), \ldots, \lambda_{\max}(Q_{G_k})\} \tag{11}$$

We consider two basic situations: a) some entries of $Q_G$, within the block $Q_{G_i}$ of the $i$th component, increase from non-zero values, but such increases do not change the number of components in the network. Denoting the $i$th component after the increase by $G_i'$, it then follows easily from Theorem 2 that $\lambda_{\max}(Q_{G_i'}) > \lambda_{\max}(Q_{G_i})$. Depending on whether $\lambda_{\max}(Q_{G_i'})$ is greater than $\lambda_{\max}(Q_G)$ or not, however, $\lambda_{\max}(Q_G)$ may or may not increase. b) some entries of $Q_G$ increase from zero to non-zero values and such increases reduce the number of components in the network. For situation b), we consider a simplified scenario where increases in the path probabilities merge two originally disjoint components, denoted by $G_i$ and $G_j$. The more complicated scenario where increases in the path probabilities join more than two originally disjoint components can be obtained recursively as an extension of the above simplified scenario. Let $G'$ be the underlying graph of the network after the increases in path probabilities and let $G_l'$ be the subgraph in $G'$ induced on the vertex set $V_i \cup V_j$. Obviously $Q_{G_l'}$ is an irreducible matrix and the following result can be established.

Lemma 6: Under the above settings,

$$\lambda_{\max}(Q_{G_l'}) > \lambda_{\max}\big(\mathrm{diag}(Q_{G_i}, Q_{G_j})\big) \tag{12}$$

The proof of Lemma 6 is straightforward and hence omitted.

Thus indeed the largest eigenvalues of the probabilistic connectivity matrices associated with disjoint components measure the quality of connectivity of the components.

Remark 4: To compare two networks with different numbers
of nodes, the normalized maximum eigenvalue of the probabilistic
connectivity matrix, i.e. the maximum eigenvalue divided by the
number of nodes, can be used.
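
As an illustration of equation (11), Lemma 6 and Remark 4, the following Python sketch computes the probabilistic connectivity matrix of a very small network by brute-force enumeration of link states and then takes its largest eigenvalue. The enumeration is only an illustrative device, not the algorithm of the paper; the function names `connectivity_matrix` and `lambda_max` are hypothetical, and the sketch assumes independent symmetric links and counts every node as connected to itself, so that the diagonal entries equal one.

```python
import itertools

import numpy as np


def connectivity_matrix(n, link_probs):
    """Brute-force probabilistic connectivity matrix of an n-node network.

    link_probs maps an undirected link (i, j) to the probability a_ij that
    the link is operative; links are assumed independent. Entry (i, j) of
    the result is the probability that a path exists between nodes i and j.
    """
    links = list(link_probs)
    Q = np.zeros((n, n))
    # Enumerate every on/off realization of the links (2**len(links) cases).
    for states in itertools.product((False, True), repeat=len(links)):
        prob = 1.0
        label = list(range(n))  # component label of each node
        for up, (i, j) in zip(states, links):
            a = link_probs[(i, j)]
            prob *= a if up else 1.0 - a
            if up and label[i] != label[j]:
                old, new = label[j], label[i]
                label = [new if x == old else x for x in label]
        # Nodes sharing a label are connected in this realization.
        Q += prob * np.equal.outer(label, label)
    return Q


def lambda_max(Q):
    """Largest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(Q)[-1])


# Two disjoint components {0, 1} and {2, 3}.
split = {(0, 1): 0.5, (2, 3): 0.8}
Q_split = connectivity_matrix(4, split)
lam_split = lambda_max(Q_split)

# A new link with non-zero probability merges the two components.
merged = dict(split)
merged[(1, 2)] = 0.3
lam_merged = lambda_max(connectivity_matrix(4, merged))

# Normalized largest eigenvalue, as suggested in Remark 4.
normalized = lam_split / 4
```

In this toy case the two blocks have largest eigenvalues 1.5 and 1.8, so the largest eigenvalue of the whole matrix is 1.8, and merging the components raises it. The enumeration grows as 2 to the power of the number of links, so the sketch is only usable for a handful of links.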

Remark 5: The fact that the largest eigenvalue of the probabilistic
connectivity matrix measures connectivity suggests the following
obvious optimization: modify one or more $a_{ij}$ under suitable
constraints to maximize the largest eigenvalue of the probabilistic
connectivity matrix. Results in [13] and [14] suggest that the
multiaffine dependence of the $q_{ij}$ on the $a_{ij}$, together with
the fact that $Q_G$ is positive semi-definite, promises to facilitate
such optimization.

V.  Conclusions  and  Further  Work 

In this paper we explored the use of the probabilistic
connectivity matrix as a tool to measure the quality of network
connectivity. Some interesting properties of the probabilistic
connectivity matrix and their connections to the quality of
network connectivity were demonstrated. In particular, the
off-diagonal entries of the probabilistic connectivity matrix
provide a measure of the quality of end-to-end connections, and we
have also provided theoretical analysis supporting the use of
the largest eigenvalue of the probabilistic connectivity matrix
as a measure of the quality of overall network connectivity.


Inequalities between the entries of the probabilistic connectivity
matrix were established. These may provide insights into
the correlations between the quality of end-to-end connections.
Further, the probabilistic connectivity matrix was shown to
be a positive semi-definite matrix whose off-diagonal entries
are multiaffine functions of link probabilities. These two
properties are expected to be very helpful in optimization and
robust network design, e.g. determining the link whose quality
improvement will result in the maximum gain in network
quality, and determining quantitatively the relative criticality
of a link to either a particular end-to-end connection or to the
entire network.

The results in the paper rely on two main assumptions:
the links are symmetric and independent. We expect that our
analysis can be readily extended such that the first assumption
on symmetric links can be removed; in fact the results
in Section III do not need this assumption. While in the
asymmetric case the probabilistic connectivity matrix is no
longer guaranteed to be positive semi-definite, we conjecture
that the largest eigenvalue retains its significance. Discarding
the second assumption requires more work. However, we are
encouraged by the following observation. If we introduce
conditional edge probabilities into the mix, then $Q_G$ is still
a multiaffine function of the $a_{ij}$ and the conditional
probabilities. Thus we still expect all the results in Section IV
to hold, though the proof may be non-trivial. In real applications
link correlations may arise due to both physical layer correlations
and correlations caused by traffic congestion.
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Abstract — Wireless  multi-hop  networks,  in  various  forms 
and  under  various  names,  are  being  increasingly  used  in 
military  and  civilian  applications.  Studying  connectivity  and 
capacity  of  these  networks  is  an  important  problem.  The 
scaling  behavior  of  connectivity  and  capacity  when  the 
network  becomes  sufficiently  large  is  of  particular  interest. 
In this paper, we briefly overview recent developments and
discuss research challenges and opportunities in the area,
with a focus on network connectivity. We demonstrate
some intrinsic connections between connectivity analysis
and capacity analysis and point out the fundamental
parameters determining the capacity of a wireless multi-hop
network.

Index  Terms — Wireless  multi-hop  networks,  connectivity, 
capacity 


I.  Introduction 

Wireless  multi-hop  networks,  in  various  forms,  e.g. 
wireless  sensor  networks,  underwater  sensor  networks, 
vehicular  networks,  mesh  networks  and  UAV  (Unmanned 
Aerial  Vehicle)  formations,  and  under  various  names, 
e.g.  ad-hoc  networks,  hybrid  networks,  delay  tolerant 
networks  and  intermittently  connected  networks,  are  being 
increasingly  used  in  military  and  civilian  applications. 
There are three defining features that characterize a
wireless multi-hop network:

1) Wireless devices are self-organized or assisted by
some infrastructure to form a network. The former
case corresponds to ad-hoc networks whereas
the latter case corresponds to infrastructure-based
multi-hop networks. Depending on the application,
the form of the infrastructure can be quite flexible,
e.g. a subset of devices connected via wired
connections, a subset of devices with more powerful
transmission capability such that they form a
wireless backbone for the network, or, in a UAV
formation, a subset of UAVs with satellite links.

2)  Communication  is  mostly  via  wireless  multi-hop 
paths. 

3)  Packets  are  forwarded  collaboratively  from  the 
source  to  the  destination. 

The implication of the first feature is that ad-hoc networks
form an important special case of wireless multi-hop
networks, but the concept of wireless multi-hop networks
has a much broader meaning. In particular, the prospect of
including infrastructure into the ad-hoc network addresses
shortcomings of the ad-hoc network in scalability and
in providing reliable service. The second feature sets
wireless multi-hop networks apart from the traditional
one-hop networks, i.e. cellular networks and wireless LANs.
Therefore, there is a unique set of challenging problems
specific to wireless multi-hop networks. The third feature
implies that collaborative communication, either centrally
designed and operated or performed in a distributed manner,
is an important consideration in wireless multi-hop networks.

Studying connectivity and capacity of wireless multi-hop
networks is an important problem [1]-[3]. The scaling
behavior of connectivity and capacity when the network
becomes sufficiently large is of particular interest. In
this paper, we briefly overview recent developments and
discuss research challenges and opportunities in the area,
with a focus on network connectivity. We also demonstrate
some intrinsic connections between the connectivity
results and capacity results and point out the fundamental
parameters determining the capacity of a wireless multi-hop
network.

A network is said to be connected iff (if and only if)
there is a (multi-hop) path between any pair of nodes.
Further, a network is said to be $k$-connected if there are
$k$ mutually independent paths between any pair of nodes
that do not have any node in common except the starting
and the ending nodes. $k$-connectivity is often required for
robust operation of the network.

The rest of the paper is organized as follows: Section
II discusses connectivity of large-scale random networks;
Section III discusses connectivity of the giant component;
Section IV discusses recent developments, research challenges
and opportunities in mobile networks; and Section
V concludes the paper.

II.  Connectivity  of  Large-Scale  Random 
Networks 

A.  Unit  disk  model  and  connectivity 

Extensive  research  has  been  done  on  connectivity 
problems  using  the  well-known  random  geometric  graph 
and  the  unit  disk  model,  which  is  usually  obtained  by 
randomly  and  uniformly  distributing  n  nodes  in  a  given 
area  and  connecting  any  two  nodes  iff  their  Euclidean 
distance  is  smaller  than  or  equal  to  a  given  threshold 
$r(n)$, known as the transmission range [3], [4]. Significant
outcomes have been achieved for both asymptotically
infinite $n$ [1], [3], [5]-[10] and finite $n$ [11]-[14].

Research on the connectivity of large-scale random ad-hoc
networks under the unit disk model was spearheaded by
Penrose [15], [16] and Gupta and Kumar [1]. Specifically,
Penrose [15], [16] and Gupta and Kumar [1] proved, using
different techniques, that if the transmission range is set
to $r(n) = \sqrt{\frac{\log n + c(n)}{\pi n}}$, a random network
formed by uniformly placing $n$ nodes on a unit-area disk in
$\Re^2$ is asymptotically almost surely (a.a.s.) connected as
$n \to \infty$ iff $c(n) \to \infty$. An event $\xi_n$ depending on
$n$ is said to occur a.a.s. if its probability tends to one as
$n \to \infty$. Penrose's result is based on the fact that in the
above random network, as $n \to \infty$ the longest edge of the
minimum spanning tree converges in probability to the minimum
transmission range required for the above random network
to have no isolated nodes (or equivalently the longest edge
of the nearest neighbor graph of the above network) [3],
[15], [16]. Gupta and Kumar's result is based on a key
finding in continuum percolation theory [17, Chapter
6]: Consider an infinite network with nodes distributed in
$\Re^2$ following a Poisson distribution with density $\rho$, in
which a pair of nodes separated by a Euclidean distance $x$ are
directly connected with probability $g(x)$, independent of
the event that another distinct pair of nodes are directly
connected. Here, $g : \Re^+ \to [0, 1]$ satisfies the conditions
of non-increasing monotonicity and integral boundedness
[17, pp. 151-152]. As $\rho \to \infty$, a.a.s. the above network in
$\Re^2$ has only a unique unbounded component and isolated
nodes.
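
The unit disk model and the critical transmission range described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the nodes are placed on a unit square rather than on the unit-area disk of [1], so boundary effects in finite samples will differ from the asymptotic statement.

```python
import math

import numpy as np


def critical_range(n, c):
    """r(n) = sqrt((log n + c) / (pi * n)) for n nodes on a unit-area region."""
    return math.sqrt((math.log(n) + c) / (math.pi * n))


def component_labels(points, r):
    """Label the connected components of the unit disk graph on `points`.

    Two nodes are directly connected iff their Euclidean distance is at
    most r. Returns a list giving the component index of every node.
    """
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    adjacent = (diff ** 2).sum(axis=2) <= r * r
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over direct connections
            u = stack.pop()
            for v in np.flatnonzero(adjacent[u]):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(int(v))
        current += 1
    return labels


def is_connected(points, r):
    """True iff there is a multi-hop path between every pair of nodes."""
    return len(set(component_labels(points, r))) == 1


# Illustrative experiment: one random placement, several values of c.
rng = np.random.default_rng(0)
n = 500
pts = rng.random((n, 2))  # assumption: unit square instead of unit-area disk
outcome = {c: is_connected(pts, critical_range(n, c)) for c in (-2.0, 0.0, 4.0)}
```

Repeating the experiment over many placements and increasing values of n gives an empirical view of how the probability of connectivity depends on c; a single placement, as here, only shows one realization.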

In [6], Philips et al. proved that the average node
degree, i.e. the expected number of neighbors of an
arbitrary node, must grow logarithmically with the area
of the network to ensure that the network is connected,
where nodes are placed randomly on a square according to
a Poisson point process with a constant density. This result
by Philips et al. actually provides a necessary condition
on the average node degree required for connectivity. In
[5], Xue et al. showed that in a network with a total
of $n$ nodes randomly and uniformly distributed on a
unit square, if each node is connected to its $c \log n$ nearest
neighbors with $c < 0.074$ then the resulting random
network is a.a.s. disconnected as $n \to \infty$; and if each
node is connected to its $c \log n$ nearest neighbors with
$c > 5.1774$ then the network is a.a.s. connected as $n \to \infty$.
In [8], Balister et al. advanced the results in [5] and
improved the lower and upper bounds to $0.3043 \log n$
and $0.5139 \log n$ respectively. In a more recent paper
[10] Balister et al. achieved much improved results by
showing that there exists a constant $c_{crit}$ such that if
each node is connected to its $\lfloor c \log n \rfloor$ nearest
neighbors with $c < c_{crit}$ then the network is a.a.s.
disconnected as $n \to \infty$, and if each node is connected to
its $\lfloor c \log n \rfloor$ nearest neighbors with $c > c_{crit}$
then the network is a.a.s. connected as $n \to \infty$. In both [8]
and [10], the authors considered nodes randomly distributed
following a Poisson process of intensity one on a square of
area $n$. In [7], Ravelomanana investigated the critical
transmission range for connectivity in 3-dimensional wireless
sensor networks and derived results similar to the
2-dimensional results in [1].

In [12], Bettstetter empirically investigated the minimum
node degree and connectivity of a finite network
with $n$ ($100 < n < 2000$) nodes randomly and uniformly
placed on a square of area $A$. Tang et al. [13] proposed
an empirical formula relating the probability of having
a connected network to the transmission range for a
finite network with $n$ ($n < 125$) nodes randomly and
uniformly distributed on a unit square. Bettstetter [11]
studied network connectivity under different node
placement models, e.g. the uniform distribution and the
Gaussian distribution. Note that most results for finite $n$ are
empirical results.

B.  More  general  connection  models  and  connectivity 

All the work described in the last subsection is based on
the unit disk model. This model may simplify analysis but
no real antenna has an antenna pattern similar to it. The
log-normal shadowing connection model, which is more
realistic than the unit disk model, has accordingly been
considered for investigating network connectivity in
[18]-[23]. Under the log-normal shadowing connection model,
two nodes are directly connected if the received power at
one node from the other node, whose attenuation follows
the log-normal model [24], is greater than or equal to a
given threshold.
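
The connection rule just described can be turned into a connection probability as a function of distance. The Python sketch below assumes one common parameterization that is not spelled out in the text: the mean received power decays with a path-loss exponent `alpha`, the shadowing term is zero-mean Gaussian in dB with standard deviation `sigma_db`, and `r0` is the distance at which the mean received power equals the threshold. All names are hypothetical.

```python
import math


def lognormal_connection_prob(d, r0, alpha, sigma_db):
    """Probability that two nodes at distance d are directly connected.

    Assumed model: received power in dB is the mean power at distance d
    plus a zero-mean Gaussian shadowing term with standard deviation
    sigma_db. The nodes are connected when the received power is at least
    the threshold, which the mean power reaches exactly at distance r0.
    """
    if d <= 0 or r0 <= 0:
        raise ValueError("distances must be positive")
    if sigma_db == 0:
        # Without shadowing the model reduces to the unit disk model.
        return 1.0 if d <= r0 else 0.0
    # Shadowing gain (in dB) needed to reach the threshold at distance d.
    margin_db = 10.0 * alpha * math.log10(d / r0)
    # Gaussian tail probability Pr[X >= margin_db].
    return 0.5 * math.erfc(margin_db / (sigma_db * math.sqrt(2.0)))


p_near = lognormal_connection_prob(0.5, r0=1.0, alpha=3.0, sigma_db=4.0)
p_edge = lognormal_connection_prob(1.0, r0=1.0, alpha=3.0, sigma_db=4.0)
p_far = lognormal_connection_prob(2.0, r0=1.0, alpha=3.0, sigma_db=4.0)
```

Under these assumptions the probability equals one half at distance `r0`, decreases with distance, and stays strictly positive at any finite distance, which is the property behind the remark that any node may be directly connected to any other node under the log-normal model.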

In [18], Hekmat et al. proposed an empirical formula, obtained
through simulations, for computing the average size of the
largest connected component, where a total of $n$ nodes
are randomly and uniformly distributed in a bounded area
in $\Re^2$. In [22], Bettstetter derived a lower bound on the
minimum node density $\rho$ required to ensure that a network
with nodes distributed in an area in $\Re^2$ according to a
Poisson point process with density $\rho$ is $k$-connected with a
high probability. The analysis is based on the observation that
the minimum node density required for a $k$-connected network is
larger than that required for the network to have a minimum
node degree $k$, and the assumption that the event that a
node has a degree greater than or equal to $k$ is independent
of the event that another node has a degree greater than or
equal to $k$. Simulations showed that the bound
is tight when the node density is sufficiently large. Using
the same model as in [22], Bettstetter et al. obtained in
[23] a lower bound on the minimum node density required
for an almost surely connected network using essentially
the same technique as that in [22]. The analysis relies on
the assumption that the event that a node is isolated and
the event that another node is isolated are independent,
hereafter referred to as the independence assumption.
Orriss et al. [19] considered nodes uniformly and randomly
distributed on a plane and communicating with
each other following the log-normal shadowing model in
the framework of cellular networks. They investigated the
distribution of the number of base stations that communicate
with a given mobile and found that the number of
base stations able to communicate with a given mobile
and lying within a specified range of the mobile follows
a Poisson distribution. In [21], Miorandi et al. presented
an analytical procedure for computing the node isolation
probability in the presence of channel randomness, where
nodes are distributed following a Poisson point process
in $\Re^2$ (which extends their earlier work in [20]). They
further obtained an estimate of the probability that there
is no isolated node in the network based on the above
independence assumption. The previous results in [18]-[23]
dealing with a necessary condition on the critical
transmission power for connectivity under the log-normal
shadowing model all rely on the independence assumption
that the node isolation events are independent, which has
only been validated by simulations. Realistically, however,
one may expect that the event that a node is isolated and
the event that another node is isolated will be correlated
whenever there is a non-zero probability that a third
node exists which has direct connections to
both nodes. In the unit disk model, this may happen
when the transmission ranges of the two nodes overlap.
In the log-normal model, any node may have a non-zero
probability of having direct connections to both
nodes. This observation, and the lack of rigorous analysis
of the node isolation events to support the independence
assumption, raise a question mark over the validity of the
results of [18]-[23].

Other work in the area includes [25]-[28], which study,
from the percolation perspective, the impact of mutual
interference caused by simultaneous transmissions, the
impact of physical layer cooperative transmissions, the
impact of directional antennas and the impact of unreliable
links on connectivity respectively.


C.  Random  connection  model  and  connectivity 

In the more recent work [29], [30], the authors considered
a network where all nodes are distributed on a unit
square $A = [-\frac{1}{2}, \frac{1}{2}]^2$ following a Poisson
distribution with known density $\rho$ and a pair of nodes are
directly connected following a random connection model, viz. a
pair of nodes separated by a Euclidean distance $x$ are
directly connected with probability
$g_{r_\rho}(x) = g\left(\frac{x}{r_\rho}\right)$,
where $g : [0, \infty) \to [0, 1]$, independent of the event that
another pair of nodes are directly connected. Here

$$r_\rho = \sqrt{\frac{\log \rho + b}{C \rho}}, \qquad 0 < C = \int_{\Re^2} g(\|x\|)\,dx < \infty \qquad (1)$$

and $b$ is a constant. The function $g$ is required to satisfy
the properties of non-increasing monotonicity and integral
boundedness [17], [31, Chapter 6]. Further, it is required
that $g$ satisfies the more restrictive requirement that

$$g(x) = o_x\left(\frac{1}{x^2 \log^2 x}\right) \qquad (2)$$

in order for the impact of the truncation effect, which
accounts for the difference between an infinite network
and a finite (or asymptotically infinite) network, on
connectivity to be asymptotically vanishingly small [29].
Denote the above network by $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$.

A number of results were obtained on the connectivity
of $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$:

1) Using the Chen-Stein technique [32], [33], it was
shown that as $\rho \to \infty$, the distribution of the
number of isolated nodes in $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$
asymptotically converges to a Poisson distribution with
mean $e^{-b}$;

2) The number of isolated nodes due to the boundary
effect in $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$ is a.a.s. zero,
i.e. the boundary effect has asymptotically vanishing impact on
the number of isolated nodes;

3) As $\rho \to \infty$, the number of components of finite
order $k > 1$ in $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$
asymptotically vanishes;

4) As $\rho \to \infty$, the number of components in
$\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$ of unbounded order
converges to one;

5) Finally, based on the above results, it was shown
that as $\rho \to \infty$, a.a.s. there are only a
unique unbounded component and isolated nodes
in $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$, and a sufficient and
necessary condition for $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$ to
be a.a.s. connected is that there is no isolated node in the
network. Further, the probability that
$\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$ has no isolated nodes and
the probability that $\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$
forms a connected network both converge to $e^{-e^{-b}}$
as $\rho \to \infty$. As a ready consequence of these results,
$\mathcal{G}(\mathcal{X}_\rho, g_{r_\rho}, A)$ is a.a.s. connected iff
$b \to \infty$ as $\rho \to \infty$; and is a.a.s. disconnected iff
$b \to -\infty$ as $\rho \to \infty$.

The above results extend the earlier work by Penrose [15],
[16] and Gupta and Kumar [1] from the unit disk model
to the more generic random connection model and bring
theoretical research in the area closer to reality. It can be
readily shown that the results on the random connection
model include the work of Penrose [15], [16] and Gupta
and Kumar [1] on the unit disk model and the work on
the log-normal model [18]-[23] as two special cases.
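
The role of the constant $b$ in these results can be checked numerically. The Python sketch below is a heuristic illustration, not the proof technique of [29], [30]: it assumes the scaling $r_\rho = \sqrt{(\log \rho + b)/(C\rho)}$ with $C = \int_{\Re^2} g(\|x\|)\,dx$, ignores the boundary effect (which the results above state is asymptotically negligible), and uses hypothetical function names.

```python
import math


def scaling_radius(rho, b, C):
    """r_rho = sqrt((log rho + b) / (C * rho)); requires log rho + b > 0."""
    numerator = math.log(rho) + b
    if numerator <= 0:
        raise ValueError("log(rho) + b must be positive")
    return math.sqrt(numerator / (C * rho))


def expected_isolated_nodes(rho, b, C):
    """Heuristic mean number of isolated nodes on a unit-area region.

    Away from the boundary, the number of direct neighbors of a node is
    Poisson with mean rho * C * r_rho**2, so the node is isolated with
    probability exp(-rho * C * r_rho**2). Multiplying by the expected
    number of nodes rho gives the mean number of isolated nodes.
    """
    r = scaling_radius(rho, b, C)
    return rho * math.exp(-rho * C * r * r)


def limiting_connectivity_prob(b):
    """Limit exp(-exp(-b)): probability that a Poisson variable with
    mean exp(-b) equals zero, i.e. that no node is isolated."""
    return math.exp(-math.exp(-b))


# Unit disk model as a special case: g(x) = 1 for x <= 1, so C = pi.
mean_isolated = expected_isolated_nodes(rho=1e4, b=1.0, C=math.pi)
p_connected = limiting_connectivity_prob(1.0)
```

With this scaling the heuristic mean equals $e^{-b}$ for every density, which matches the Poisson limit in item 1), and the limiting probability moves from 0 to 1 as $b$ goes from $-\infty$ to $\infty$, consistent with item 5).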

D.  Challenges 

There  remain  significant  challenges  ahead. 


Most results in the area, including the results in [29],
[30], rely on three main assumptions: a) the connection
function $g$ is isotropic, b) the connections are independent,
c) nodes are distributed following a Poisson or uniform distribution.

We  conjecture  that  assumption  a)  is  not  a  critical 
assumption,  i.e.  under  some  mild  conditions,  e.g.  nodes 
are  independently  and  randomly  oriented,  assumption  a) 
can  be  removed  while  the  above  results,  particularly  the 
ones  obtained  assuming  a  random  connection  model,  are 
still  valid.  It  however  remains  to  validate  the  conjecture. 

The above results, however, critically rely on assumption
b), which is not necessarily valid in some networks due
to channel correlation and interference, where the latter
effect makes the connection between a pair of nodes
dependent on the locations and activities of other nearby
nodes. In [34], some preliminary work was conducted
on the connectivity of CSMA networks considering the
impact of interference. The work essentially uses a
decoupling approach to handle the connection
correlation caused by interference: it develops an upper
bound on the interference experienced by any receiver
in the network and then studies the connectivity of the
CSMA network using the bound. The results suggest
that when some realistic constraints, such as
carrier sensing, are considered, the connectivity results,
e.g. the transmission power required for a connected network,
will be very close to those obtained under a unit disk model.
This conclusion is in stark contrast with that obtained under
an ALOHA multiple-access protocol [25]. Other work in
the area includes the work of Haenggi and his colleagues
(see e.g. [2], [35]), which characterizes various properties
of multi-hop networks subject to interference by using a
Poisson distribution to approximate the distribution of the
set of concurrent transmitters. The major obstacle in dealing
with the impact of channel correlation is that there is
no widely accepted model in the wireless communication
community capturing the impact of channel correlation
on connections.

Finally,  it  is  a  logical  move  after  the  above  work  to 
consider  connectivity  of  networks  with  nodes  distributed 
following  a  generic  distribution  other  than  Poisson  or 
uniform.  This  remains  a  major  challenge  in  the  area. 

III.  Connectivity  of  Giant  Component 

A  giant  component  is  a  component  with  a  designated 
large  percentage  of  nodes  in  the  network,  say  p  where 
0.5  <  p  <  1.  A  component  is  a  maximal  set  of  nodes 
where  there  is  a  path  between  any  pair  of  nodes  in  the 
set. 

Results on connectivity of large-scale random networks
under both the unit disk model [1], [15], [16] and the more
generic random connection model [29], [30] revealed the
same scaling law. That is, when the number of nodes, denoted
by $n$, in a network increases, the transmission range
(or power) has to increase at a rate that maintains an average
node degree of $\Theta(\log n)$ in order to achieve connectivity.
For two functions $f(x)$ and $h(x)$, $f(x) = \Theta(h(x))$
iff there exist a sufficiently large $x_0$ and two positive
constants $c_1$ and $c_2$ such that for any $x > x_0$,
$c_1 h(x) \geq f(x) \geq c_2 h(x)$. For example, the critical
transmission range for connectivity is
$r(n) = \sqrt{\frac{\log n + c(n)}{\pi n}}$ under
the unit disk model for a random network formed by
uniformly placing $n$ nodes on a unit-area disk [1], [15],
[16] (see footnote 1). In other words, a connected network poses
a very demanding requirement on the transmission range (or
power). This in turn causes undesirable effects such as
increased interference and reduced throughput.

In the following, we demonstrate the connections between
the results on connectivity and the results on network
capacity. In [36], it was shown that the end-to-end
throughput between a randomly chosen source-destination
pair in the above network is
$\Theta\left(\frac{W}{\sqrt{n \log n}}\right)$, where $W$ is the
link capacity. Refer to [36] for a rigorous definition of network
capacity. This result on capacity can be intuitively
explained using the results on connectivity as follows: as
the number of nodes $n$ increases, the average distance,
measured by the number of hops, between a randomly
chosen pair of nodes is
$\Theta\left(\frac{1}{r(n)}\right) = \Theta\left(\sqrt{\frac{n}{\log n}}\right)$.
That is, for a typical node, for every packet transmitted for
itself, there are $\Theta\left(\sqrt{\frac{n}{\log n}}\right)$ relay
packets transmitted for other source-destination pairs. Further,
the average node degree is $n \pi r^2(n) = \Theta(\log n)$, which
implies that in a neighborhood of a typical node, at any time
only one out of every $\Theta(\log n)$ nodes can be active. It
follows that the end-to-end throughput between a typical
source-destination pair is

$$\Theta\left(\frac{W}{\sqrt{n / \log n}\,\log n}\right) = \Theta\left(\frac{W}{\sqrt{n \log n}}\right),$$

which is the result in [36]. The above result can
be more rigorously derived by following similar steps
as those in [36] on analyzing the capacity of a random
network. Therefore the reduced capacity as $n \to \infty$ is
attributable to the more demanding requirement on the
transmission range (or power) to maintain connectivity as
$n \to \infty$.

The above observation motivates a question: since
network connectivity is a very demanding requirement,
is there any benefit in backing down from it
and requiring most nodes, instead
of all nodes, to be connected?

Indeed, in many applications it is unnecessary for
all nodes to be always connected to each other [37].
Examples of such applications include a wireless sensor
network for habitat monitoring [38], [39] or environmental
monitoring [40], [41] and a mobile ad-hoc network in
which users can tolerate short off-service intervals [42].

In environmental monitoring, there are scenarios where
the size of the monitored phenomenon is very large (e.g.
rain clouds) or the parameters (e.g. temperature, humidity)
that are monitored change slowly both in space and in
time. When the number of nodes for monitoring the
phenomenon or measuring the parameters is very large,
having a few disconnected nodes will not cause a
statistically significant change in the monitored parameters.
One example of such applications is a wireless sensor
network that was deployed underneath the Briksdalsbreen
glacier in Norway to monitor the pressure, humidity,
and temperature of ice to understand glacial dynamics in
response to climate change [40]. In habitat monitoring,
there are scenarios where the number of objects (e.g.
zebras and cane toads [38]) that are monitored is large.
Having a few nodes disconnected or lost may not
significantly affect the accuracy of the monitored parameter. In
many mobile ad-hoc networks, having a number of nodes
temporarily disconnected is also not critical, as long as
users can tolerate short off-service intervals. For example,
in a campus-wide wireless network, students and staff
can share information using wireless devices (e.g. laptops
and personal digital assistants) around the campus [42].
When a wireless device temporarily loses connection, it
can store the data and complete the work after becoming
connected later.

1 By scaling, it can be shown that assuming an extended network
model where nodes are distributed on a disk of area $n$ with a constant
density of 1 node per unit area, the critical transmission range for
connectivity is $r(n) = \sqrt{\frac{\log n + c(n)}{\pi}}$.

In [43], [44], considering a network with a total of $n$
nodes uniformly and i.i.d. distributed on a unit square in
$\Re^2$, it was shown analytically that under both the unit disk
model [44] and the log-normal model [43], the transmission
range (or power) required for having a designated large
percentage of nodes connected, say $p$ where $0.5 < p < 1$,
is asymptotically vanishingly small compared to that
required for having a connected network, irrespective of
the value of $p$. This result implies that significant energy
savings can be achieved if we require only most nodes
(e.g. 95%, 99%) to be connected, instead of requiring all
nodes to be connected; given a network with most nodes
connected, a sharp increase in the transmission range (or
power) is required to connect the few remaining
hard-to-reach nodes; and the transmission range (or power)
required for a large network to be connected is dominated
by these hard-to-reach nodes or rare events. It was further
shown using simulations that under the unit disk model,
in a network with 1000 nodes, the transmission range
required for having 95% of nodes connected is only 76% of
that required for having all nodes connected. Based on a
conservative estimate that the required transmission power
increases with the square of the required transmission
range, an energy saving of at least 42% can be achieved by
sacrificing 5% of nodes. That energy saving will further
increase with an increase in the number of nodes in the
network. Another benefit of the reduced transmission range
or power requirement is reduced interference and hence
better throughput.
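
The 42% figure follows directly from the 76% range ratio and the square-law power model. The short Python sketch below reproduces the arithmetic; the function name is hypothetical, and the exponent of 4 in the second call is only an illustrative assumption showing why the square law is described as conservative.

```python
def energy_saving(range_ratio, path_loss_exponent=2.0):
    """Fraction of transmission power saved when the required range is
    reduced to `range_ratio` of its original value, assuming power grows
    as range ** path_loss_exponent."""
    if not 0.0 < range_ratio <= 1.0:
        raise ValueError("range_ratio must lie in (0, 1]")
    return 1.0 - range_ratio ** path_loss_exponent


# Square-law estimate used in the text: 1 - 0.76**2 = 0.4224.
saving_square_law = energy_saving(0.76)

# Illustrative assumption: a larger exponent gives a larger saving.
saving_fourth_power = energy_saving(0.76, path_loss_exponent=4.0)
```

Any path-loss exponent above 2 yields a saving above 42% for the same range ratio, so the square law gives a lower bound on the saving within this simple model.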

It  remains  to  find  the  value  of  the  transmission  range 
(or  power)  required  for  guaranteeing  a  designated  large 
percentage  of  nodes  to  be  connected  in  a  large  scale 
network.  This  problem  has  some  intrinsic  connections 
to  the  problem  of  finding  the  percolation  probability  in 
the  continuum  percolation  theory  [17],  which  is  a  well- 
known  open  problem  in  the  area.  Further,  it  remains  to 
quantitatively  characterize  the  benefit  in  capacity  due  to 
the reduced transmission range (or power) required for a
giant component.

Other  researchers  approached  the  problem  caused  by 
the  demanding  requirement  of  a  connected  network  on  the 
transmission  range  (or  power)  from  a  different  perspective 
and  considered  the  use  of  infrastructure  instead.  Here  the 
infrastructure  can  be  quite  flexible.  It  can  be  a  subset 
of  nodes  connected  through  wired  connections  [45],  or  a 
subset  of  nodes  with  possibly  more  powerful  transmission 
capability  that  forms  a  wireless  backbone  of  the  network 
[46],  [47],  or  a  subset  of  nodes  with  satellite  links  as  one 
would  possibly  encounter  in  UAV  formations  [48].  The 
use of infrastructure does not change the wireless multi-hop
nature of the end-to-end communication; instead, the
infrastructure assists the end-to-end communication by
leapfrogging some long hops and reducing the number
of hops between two nodes, hence improving the performance.
Accordingly, the concept of k-hop connected
networks was proposed and investigated [49]-[52]. In a
k-hop connected network, the maximum number of hops
between any two nodes is smaller than or equal to k.
Some research in the area was also conducted under the
name of hybrid networks [45], [53].
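The k-hop definition above can be checked directly with a breadth-first search from every node. The sketch below is illustrative only: the adjacency-dictionary representation, the function names, and the choice to count an infrastructure link as a single hop are assumptions made here, not taken from [49]-[52].

```python
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: minimum hop count from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_k_hop_connected(adjacency, k):
    """True if every pair of nodes is joined by a path of at most k hops."""
    nodes = list(adjacency)
    for s in nodes:
        dist = hop_counts(adjacency, s)
        if len(dist) < len(nodes) or max(dist.values()) > k:
            return False
    return True


# A chain of five ordinary nodes: a - b - c - d - e.
chain = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
# The same chain with one (hypothetical) infrastructure link between a and e.
with_backbone = {node: list(nbrs) for node, nbrs in chain.items()}
with_backbone["a"].append("e")
with_backbone["e"].append("a")

print(is_k_hop_connected(chain, 2))
print(is_k_hop_connected(with_backbone, 2))
```

In this toy example a single infrastructure link reduces the maximum hop count from four to two, which is the leapfrogging effect described above.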

Despite previous research in the area of hybrid networks
or k-hop connected networks, no conclusive results
have been obtained yet on the role of infrastructure in
wireless multi-hop networks, and many problems remain
unanswered. Some examples include: for randomly deployed
infrastructure nodes and "ordinary" nodes, how
many infrastructure nodes (versus ordinary nodes) are
required for a k-hop connected network; for deterministically
deployed infrastructure nodes and randomly deployed
ordinary nodes, how many infrastructure nodes
are required for a k-hop connected network and what is
the optimum deployment of infrastructure nodes; how to
combine the use of infrastructure-based communications
and ad-hoc communications in one network in order to
provide some performance guarantee, in terms of capacity
or delay. These problems are important for wireless multi-hop
networks to provide reliable services, particularly for
wireless vehicular networks in which both infrastructure-based
communications and ad-hoc communications will
co-exist [54].

IV.  Development  and  Challenges  in  Mobile 
Networks 

In [55], Grossglauser and Tse studied the capacity of
mobile ad-hoc networks. In particular, they considered a
network with a total of n nodes distributed on a unit-area
disk, in which the trajectories of different nodes are i.i.d. and the
nodes' movement is such that the spatial distribution of
nodes is stationary and ergodic with stationary uniform
distribution on the disk. They showed that in the above
network with unbounded delay requirement, the throughput
between a randomly chosen source-destination pair
can be kept constant even as n increases. This result is
in stark contrast with its counterpart in static networks,
in which the throughput between a randomly chosen
source-destination pair is shown to be $\Theta(1/\sqrt{n \log n})$ [36].


Following  the  seminal  work  of  Grossglauser  and  Tse, 
other  researchers  have  conducted  further  research  trying 
to  quantitatively  characterize  the  relationship  between 
delay,  mobility  and  capacity  in  mobile  ad-hoc  networks 
[49], [56]-[59], and the obtained results vary greatly with
the  mobility  models  and  network  settings. 

A fundamental reason why mobility increases throughput
is that in mobile networks message transmissions
generally follow the store-carry-forward pattern versus the
store-forward pattern found in static networks. As nodes
move, new opportunities may arise such that a mobile node
can carry the message until it meets a node that is in
a better position than itself to transmit the message to the
destination, or until it meets the destination directly. In
this way, the number of relay nodes (number of hops)
involved in transmitting a message to its destination can
be greatly reduced and the required transmission range
(or power) for a node to reach another node via a multi-hop
path can also be greatly reduced, hence the benefit
in improved capacity. The cost of achieving this benefit
in capacity is the increased delay.

Following the techniques demonstrated in Section III,
the results in [55] on the capacity of mobile networks can
also be obtained as follows. In the network model considered
in [55], a two-hop relaying strategy is employed.
Therefore as the number of nodes n increases, the average
distance, measured by the number of relay hops, between
a randomly chosen source-destination pair is bounded
by 2. Further, the value of the transmission range r does
not have to increase with n because connectivity (in the
sense that every pair of nodes can exchange packets)
can be achieved through node movement. Therefore the
average node degree is also $\Theta(1)$. It then readily follows
that the capacity of the mobile network is $\Theta(1)$. The
end-to-end delay can be analyzed by evaluating the time taken for
two balls with radius equal to r/2, representing the source
and the relay node respectively, to hit a (randomly chosen)
third ball, representing the destination, with radius r/2.

The  above  observation  motivates  us  to  conclude  that 
the  fundamental  factors  that  determine  the  capacity  of  a 
mobile  (or  static)  network  are: 

1)  The  expected  number  of  simultaneously  active 
transmissions.  This  is  further  determined  by  the 
spatial  node  distribution  and  the  transmission  range 
(or  power). 

2) The average number of relay hops between a source
and its destination. It determines the average number
of times that a packet needs to be transmitted
before reaching its destination, i.e. the transmission
capacity consumed for an end-to-end transmission.

Therefore the only difference between mobile and static
networks is that some part of the job involved in moving
a packet, originally taken care of entirely by wireless
transmissions in static networks, can now be taken care of
by the physical movement of nodes in mobile networks.
By viewing both the physical movements of nodes and the
wireless transmission as simply a way to move packets
physically over a distance, a unified theory for analyzing
the performance of both mobile and static networks can
be established.

By  analogy,  mobility  can  also  improve  connectivity. 
There  are  three  fundamental  differences  between  mobile 
networks  and  static  networks  [60]  from  a  graph  theory 
perspective:  in  mobile  networks 

• the wireless link between two directly connected
nodes and the end-to-end path exist only temporarily;

• two nodes may never be part of the same connected
component but they are still able to communicate,
i.e. exchange messages, with each other; and

• while any one wireless link may be (or assumed to
be) undirected, the path connecting any two nodes
is directional, i.e. the existence of a path from node v_i to
node v_j within a designated time period does not
necessarily mean that there is a path from v_j to v_i within
the same period.

These  are  illustrated  in  Fig.  1.  Particularly  the  last 
difference  implies  that  it  is  important  to  consider  the 
temporal  order  of  links  when  analyzing  mobile  networks, 
which  has  been  incorrectly  neglected  in  some  previous 
work. 

Due  to  these  differences,  many  established  concepts  in 
static  networks  must  be  revisited  for  mobile  networks.  For 
example,  a  static  wireless  multi-hop  network  is  said  to  be 
connected  iff  there  is  a  path  between  any  pair  of  nodes 
in  the  network.  However  a  more  meaningful  definition  of 
connectivity  in  mobile  networks  is  to  say  that  a  mobile 
network is connected in time period [0, T] if any node can
exchange a message with any other node within [0, T].
The  above  definition  implies  that  the  tradeoff  between 
connectivity,  mobility  and  delay  is  the  prime  issue  when 
analyzing  the  connectivity  of  mobile  networks.  Despite 
intensive  research  on  the  properties  of  mobile  networks, 
no  conclusive  results  have  been  obtained  on  the  above 
problem  and  it  remains  a  major  challenge  in  the  area. 
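The role of temporal ordering, and the definition of connectivity over a period [0, T], can be made concrete with a small sketch. It is illustrative only and rests on assumptions made here: contacts are given as hypothetical (time, u, v) tuples, each contact is an instantaneous undirected link, all contact times are distinct, and a message is created at time 0.

```python
def reachable_within(contacts, source, horizon):
    """Nodes that a message created at `source` at time 0 can reach by
    time `horizon` along time-respecting (store-carry-forward) paths.

    `contacts` is a list of (time, u, v) tuples, each meaning the link
    between u and v is usable in both directions at that instant.
    Assumption: all contact times are distinct.
    """
    informed = {source}
    for time, u, v in sorted(contacts):
        if time > horizon:
            break
        # A node can forward only if it already holds the message.
        if u in informed or v in informed:
            informed.update((u, v))
    return informed


def connected_in_period(nodes, contacts, horizon):
    """True if every node can deliver a message to every other node by `horizon`."""
    everyone = set(nodes)
    return all(
        reachable_within(contacts, s, horizon) == everyone for s in nodes
    )


nodes = ["a", "b", "c"]
contacts = [(1, "a", "b"), (2, "b", "c")]

# a reaches c (a->b at time 1, then b->c at time 2) ...
print(sorted(reachable_within(contacts, "a", 2)))
# ... but c cannot reach a, because the a-b contact precedes the b-c contact.
print(sorted(reachable_within(contacts, "c", 2)))
print(connected_in_period(nodes, contacts, 2))
# A later a-b contact at time 3 restores the reverse path.
print(connected_in_period(nodes, contacts + [(3, "a", "b")], 3))
```

Each link here is undirected, yet the path from a to c exists while the path from c to a does not, which is the asymmetry described in the third difference above.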

V.  Summary 

Wireless  multi-hop  networks  have  attracted  significant 
research  interest.  This  interest  is  expected  to  grow  further 
with  the  proliferation  of  applications,  particularly  in  the 
areas  of  wireless  vehicular  networks  and  sensor  networks. 
In this paper, we briefly overviewed recent developments
and discussed research challenges and opportunities in the
area mainly from the perspective of network connectivity.
We also showed how the results on network connectivity
are related to the study of other performance metrics,
i.e. capacity and delay. We pointed out the fundamental
parameters determining the capacity of a wireless multi-hop
network and the fundamental difference between
mobile and static networks.

Abstract: IEEE 802.11p and 1609 standards are currently
under development to support Vehicle-to-Vehicle and
Vehicle-to-Infrastructure communications in vehicular networks. For
infrastructure-based vehicular relay networks, access probability
is an important measure which indicates how well an arbitrary
vehicle can access the infrastructure, i.e. a base station (BS).
On the other hand, connectivity probability, i.e. the probability
that all the vehicles are connected to the infrastructure,
indicates the service coverage performance of a vehicular relay
network. In this paper, we develop an analytical model
with a generic radio channel model to fully characterize the
access probability and connectivity probability performance in
a vehicular relay network considering both one-hop (direct
access) and two-hop (via a relay) communications between a
vehicle and the infrastructure. Specifically, we derive closed-form
equations for calculating these two probabilities. Our analytical
results, validated by simulations, reveal the tradeoffs between
key system parameters, such as inter-BS distance, vehicle density,
transmission ranges of a BS and a vehicle, and their collective
impact on access probability and connectivity probability under
different communication channel models. These results and new
knowledge about vehicular relay networks will enable network
designers and operators to effectively improve network planning,
deployment and resource management.

Index Terms: Vehicular Ad Hoc Network (VANET), Wireless
Access in Vehicular Environments (WAVE), IEEE 802.11p, IEEE
1609, access probability, connectivity, relay.

I. Introduction

A vehicular ad-hoc network (VANET) is a promising type of
application-oriented network deployed along
a highway for safety and emergency information delivery (for
drivers), entertainment content distribution (for passengers),
and data collection and communication (for road and traffic
managers). A VANET is a hybrid wireless network that supports
both infrastructure-based and ad hoc communications.
Specifically, vehicles on the road can communicate with each
other through a multi-hop ad hoc connection. They can also
access the Internet and other broadband services through
the roadside infrastructure, i.e. base stations (BSs) or access
points (APs) along the road. When a vehicle moves out of
the radio coverage area of a BS, e.g. it is located in the
coverage gap between two adjacent BSs, it will identify and
use its neighboring vehicles (if any) as relays to access the
roadside infrastructure. These types of Vehicle to Vehicle
(V2V) and Vehicle to Infrastructure (V2I) communications
have recently received significant interest from both academia
and industry [1], [2], [3]. V2V communication has so far
been envisioned for supporting safety and traffic management
applications. With better sensing and data communication
techniques, drivers can share information such as slippery
roads, poor visibility, sudden stops and road congestion with
each other. Hence warnings are provided to prevent accidents
and improve road safety.

As shown in Fig. 1, the IEEE 802.11p standard cooperates with
the IEEE 1609 standard family, which is developed to support
Wireless Access in Vehicular Environments (WAVE) and to
deliver safety and infotainment applications to vehicles on the
road [4], [5]. Specifically, IEEE 802.11p is a draft amendment
(with WAVE capability) to the IEEE 802.11 standards, which
is expected to be finalized and approved in 2010. The goal of
the 802.11p standard is to provide V2V and V2I communications,
up to a range of 1 km, at an average data rate of 6 Mbps over
the dedicated 5.9 GHz (5.85-5.925 GHz) licensed frequency
band. IEEE 802.11p uses an amended 802.11a physical-layer
specification with the Orthogonal Frequency-Division Multiplexing
(OFDM) technique. On the Medium Access Control (MAC)
layer, it adopts the Enhanced Distributed Channel Access
(EDCA) protocol from the 802.11e standard to support Quality
of Service (QoS). The following standards are included in the
IEEE 1609 standard family: IEEE P1609.0, IEEE P1609.1,
IEEE P1609.2, IEEE P1609.3, IEEE P1609.4. Some new
standards have recently been added to the IEEE 1609 family, such
as 1609.5 (Communications Management), 1609.6 (Facilities)
and 1609.11 (Electronic Payment Service). Their functions and
relationships with other 1609 standards are shown in Fig. 1.

Fig. 1. Architecture of IEEE 1609 standards family

IEEE 802.11p and 1609 standards are still in the draft stage.
The harsh vehicular communication environment, caused by
variable vehicle speed, high mobility, and dynamic network
topology, brings many technical challenges in developing
and deploying WAVE applications and services. From a
user/vehicle's perspective, the first, and probably the most
important, service requirement is to be able to access the roadside
infrastructure, directly or indirectly via a relay vehicle. From
the perspective of a network operator or service provider, it
is important to guarantee satisfactory and profitable service
coverage while minimizing the deployment and maintenance
costs of the roadside infrastructure.

To improve user satisfaction and service coverage of future
IEEE 1609 based WAVE systems and applications, this
research develops an analytical model to fully characterize
the access probability (for user satisfaction analysis) and the
connectivity probability (for service coverage analysis) for
infrastructure-based vehicular relay networks, wherein both
one-hop (direct access) and two-hop (via a relay) communications
between a vehicle and the infrastructure (i.e. a BS)
are supported. In this paper, a generic connection model is
used to investigate the impact of different system parameters,
i.e. inter-BS distance (or BS density), vehicle density, radio
coverage ranges of BSs and vehicles, on key performance
metrics, i.e. user access probability and service connectivity
probability. The analysis is then applied to two widely used
communication channel models as specific examples of the
generic connection model. This research enables us to improve
access probability and connectivity probability in vehicular
relay networks, and therefore support reliable V2I and V2V
data transmissions in different commercial applications and
services, such as emergency messaging service, mobile Internet
access and on-road entertainment.

The  rest  of  this  paper  is  organized  as  follows.  In  Section  II 
we  introduce  related  work.  In  Section  III  we  define  the  system 
model.  In  Section  IV  we  present  the  analysis  of  the  access  and 
connectivity  probabilities  under  a  generic  radio  channel  model. 
In  Section  V  we  focus  on  two  widely  used  radio  channel 
models, i.e. the unit disk communication model and the log-normal
shadowing model, and their analysis as special cases
of the generic channel model. In Section VI we discuss the
analytical and simulation results, followed by conclusions in
Section VII.

II.  Related  Work 

Recently, significant research on VANET, WAVE, IEEE
802.11p and 1609 standards has been undertaken to measure,
estimate and characterize wireless vehicular channels [6], [7],
[8], to model and analyze system performance [9], [10], [11],
[12], [13], [14], to design and evaluate MAC protocols [15],
[16], [17], [18], [19], and to develop VANET simulators [20]
and testbeds [21]. Specifically, in [9], it is found that a communication
distance of 1000 m, which is specified in the IEEE 802.11p
Project Authorization Request (PAR), cannot be achieved by
an Equivalent Isotropically Radiated Power (EIRP) of 2 W in
a vehicle-to-vehicle (V2V) highway scenario. The impacts of
vehicle density (or V2V distance) and Line-Of-Sight (LOS)
communication link on different system performance metrics,
such as throughput, average delay, packet loss, and collision
probability, are investigated in [10], [11], [12].

In [15], it is pointed out that the performance of the IEEE
802.11p standard is not satisfactory in the infrastructure data
collection mode with a static backoff scheme. The capability
of the 802.11p MAC protocol is further evaluated and enhanced
to support both safety applications (i.e. emergency message
dissemination with strict time constraints) and non-safety
applications [16], [17], [22], [18], [19]. In [23], Salhi et al.
presented a novel data gathering and dissemination architecture
based on hierarchical and geographical mechanisms for
vehicular sensor networks. In [24] and [25] practical traffic
prioritization and power control schemes are proposed and
evaluated respectively, to support real-time delivery of
safety-critical emergency information. In [26], Shrestha et al. develop
a new scheme using the BitTorrent tool and bargaining game
to efficiently distribute a large amount of data over V2V and
V2I communication links.

Access and connectivity probabilities have been studied
in the literature for one-dimensional (1-D) [27], [28] and
two-dimensional (2-D) [29], [30], [31] multi-hop wireless
networks. In [27], Wu focuses on V2V communications and
derives a closed-form expression of connectivity probability in
a linear VANET with high-speed vehicles and time-varying
vehicle populations, i.e. dynamic network topology and vehicle
density. The impacts of some key network parameters,
such as vehicle arrival rate, random vehicle speed and
transmission range, are considered in his work. In [28], Dousse,
Thiran and Hasler study a 1-D network with equally spaced
BSs and Poisson-distributed vehicles. Considering a unit disk
communication model, they derive the connectivity probability
(defined as access probability in this paper) that an arbitrary
vehicle can reach at least one BS over multiple hops.

For  a  2-D  multi-hop  cellular  network,  where  nodes  are 
uniformly  distributed  in  a  circular  area  of  unit  radius,  Ojha 
et  al.  obtain  the  minimum  transmission  range  required  for 
these  nodes  to  be  able  to  access  a  BS  located  at  the  center 
of  this  circular  area  over  multiple  hops  under  the  unit  disk 
communication  model  as  the  number  of  nodes  goes  to  infinity 
[29]. When both BSs and nodes are Poisson distributed in a
2-D area and a log-normal shadowing communication model
is considered, a lower bound on the probability that a node
cannot access any BS within a designated number of hops has
been derived in [30]. The result relies on the assumption
that the event that a node cannot access a
BS in a specific number of hops, say t, and the event that
another node cannot access a BS in t hops are independent,
which is not always valid (to be shown in
this paper). In [31], the probability that a wireless ad-hoc
network with randomly and uniformly distributed nodes forms
a connected network is studied. It is shown that the probability
of having a connected network and the probability of having
no isolated node asymptotically converge to the same value
as the number of nodes in the network goes to infinity.

Different from previous work carried out mainly under the
unit disk communication model and considering no restriction
on the maximum hop count for packet transmission, this
research focuses on vehicular relay networks with a maximum
hop count of two hops to ensure high communication quality
and reliability between a node and a BS, i.e. two-hop V2I
communications, which is more practical for real-world VANET
application scenarios. In addition, our analytical approach uses
a generic communication channel model and derives the exact
closed-form equations of access and connectivity probabilities,
not the asymptotic results that are valid only when the number
of nodes (or node density) in a network is very large. The
results obtained under the generic channel model are then
applied to two widely used models, i.e. the unit disk model and
the log-normal model, as special cases. Finally, we investigate
the impacts of some key system parameters, such as inter-BS
distance (or BS density), vehicle density, transmission
ranges of a BS and a vehicle, on user access probability and
service connectivity probability performance under different
communication channel models.

III.  System  Model 

We consider an infrastructure-based vehicular relay network,
as shown in Fig. 2, wherein a number of BSs are
uniformly deployed along a long road, while vehicles
are distributed on the road randomly according to a Poisson
distribution. We analyze the access probability, i.e. the probability
that an arbitrary vehicle can access its nearby BSs within
two hops, and the connectivity probability, i.e. the probability
that all vehicles can access at least one BS within two hops,
of the network by investigating a subnetwork bounded by two
adjacent base stations.

Fig. 2. An Infrastructure-based Vehicular Relay Network.

Let L be the Euclidean distance (in
meters) between two adjacent BSs and $\rho$ be the vehicle density
measured in vehicles per meter (vpm). Since the vehicles are
assumed to be Poisson distributed with density $\rho$, discussion
on the distribution of the number of vehicles on the road is
only meaningful if we restrict attention to the (random) number of
vehicles in a specific section of the road, and we call any
section of the road a road segment. For a road segment with
length x, the number of vehicles on that road segment is then
a Poisson random variable with mean $\rho x$. So the probability
that there are k vehicles on a road segment of x meters is
given as

$$f(k, x) = \frac{(\rho x)^k e^{-\rho x}}{k!}, \quad k \ge 0. \tag{1}$$

Since  we  investigate  a  subnetwork  bounded  by  two  adjacent 
BSs,  the  probability  that  there  are  k  vehicles  on  the  road 
segment  bounded  by  two  adjacent  BSs  is  then  f(k,  L). 
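As a quick numerical illustration of Eq. (1), the sketch below evaluates the Poisson probability for a road segment. The function name `vehicle_count_pmf` and the example values (a density of 0.01 vehicles per meter and a 500 m segment) are assumptions chosen here for illustration.

```python
import math


def vehicle_count_pmf(k, x, rho):
    """Probability of exactly k vehicles on a road segment of x meters,
    for Poisson-distributed vehicles with density rho (vehicles per meter)."""
    mean = rho * x
    return mean ** k * math.exp(-mean) / math.factorial(k)


rho = 0.01      # hypothetical density: one vehicle per 100 m on average
length = 500.0  # hypothetical inter-BS distance in meters

for k in range(4):
    print(k, vehicle_count_pmf(k, length, rho))

# The probabilities over all k sum to one (truncated here at k = 59).
print(sum(vehicle_count_pmf(k, length, rho) for k in range(60)))
```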

Assuming a generic channel model $\mathcal{C}$, let $g_v^{\mathcal{C}}(x)$ be the
probability that two vehicles separated by a Euclidean distance
x are directly connected. Similarly, denote by $g_b^{\mathcal{C}}(x)$ the
probability that a vehicle and a BS separated by a Euclidean
distance x are directly connected. We assume that the event
that two vehicles (or a vehicle and a BS) are directly connected
is independent of the event that another two vehicles (or a
vehicle and a BS) are directly connected. That is, whether
two vehicles (or, similarly, a vehicle and a BS) are directly
connected depends only on the locations
of the two nodes and is not affected by the presence or
absence of connections between other pairs of vehicles¹. We
also assume that $g_b^{\mathcal{C}}(x) \ge g_v^{\mathcal{C}}(x)$. This assumption is justified
because it is often the case that a BS can not only transmit
at a larger transmission power than a vehicle, it can also be
equipped with more sophisticated antennas, which make it
more sensitive to the transmitted signal from a vehicle.


IV.  Analysis  of  Access  and  Connectivity 
Probabilities 

Assume that the subnetwork being considered is placed at
[0, L]. The two BSs at both ends of the subnetwork are labeled
as BS1 and BS2 and are at 0 and L respectively. Denote by
$G(L, \rho, \mathcal{C})$ the subnetwork with length L, vehicle density $\rho$
and channel model $\mathcal{C}$. We investigate the access probability
$p_a$ that an arbitrary vehicle in $G(L, \rho, \mathcal{C})$ can access either BS
(BS1 or BS2). We also investigate the probability $p_c$
that all vehicles in $G(L, \rho, \mathcal{C})$ are connected to at least one of
the BSs at both ends of the subnetwork.

¹ Although field measurements in real applications seem to indicate that the
connectivity between different pairs of geographically / frequency proximate
wireless nodes is correlated [32], [33], [34], the independence assumption
is generally considered appropriate for far-field transmission and has been
widely used in the literature under many channel models including the
log-normal shadowing model [35], [36], [30], [37]. Note that the unit disk model
is a special channel model which fulfills the independence assumption by
nature [38, pg. 12]. This is because for the unit disk model, two vehicles are
directly connected if and only if their Euclidean distance is smaller than the
transmission range, and this is not affected by the presence or absence of other
connections (vehicles).

A vehicle is said to be located at x if its Euclidean distance
to BS1 is x. The probabilities that a vehicle located at x is
not directly connected to BS1 and to BS2 are $1 - g_b^{\mathcal{C}}(x)$ and
$1 - g_b^{\mathcal{C}}(L - x)$ respectively. Because the event that a vehicle
is not directly connected to BS1 and the event that the same
vehicle is not directly connected to BS2 are independent, the
probability that the vehicle is directly connected to either BS1
or BS2 is then

$$p_1(x) = 1 - \left(1 - g_b^{\mathcal{C}}(x)\right)\left(1 - g_b^{\mathcal{C}}(L - x)\right). \tag{2}$$
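Eq. (2) can be evaluated for any vehicle-to-BS connection function. The sketch below is illustrative only: the helper `unit_disk`, the function name `p1`, and the numerical values (inter-BS distance 1000 m, BS range 200 m) are assumptions made here, with the unit disk rule taken as "connected if and only if the distance does not exceed the range".

```python
def unit_disk(radius):
    """Connection function of the unit disk model with the given range."""
    def g(distance):
        return 1.0 if distance <= radius else 0.0
    return g


def p1(x, L, g_b):
    """Probability that a vehicle at x is directly connected to BS1 (at 0)
    or BS2 (at L), given the vehicle-to-BS connection function g_b."""
    return 1.0 - (1.0 - g_b(x)) * (1.0 - g_b(L - x))


L = 1000.0              # hypothetical inter-BS distance in meters
g_b = unit_disk(200.0)  # hypothetical BS range in meters

# Vehicles near either BS are covered; a vehicle in the gap is not.
for x in (100.0, 500.0, 900.0):
    print(x, p1(x, L, g_b))

# With a distance-independent connection probability of 0.5 to each BS,
# the vehicle misses both BSs with probability 0.25.
print(p1(300.0, L, lambda d: 0.5))
```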


In order to derive $p_a$ and $p_c$ we need the following lemmas.


Lemma 1. Let $K_1$ be the set of vehicles in the subnetwork
$G(L, \rho, \mathcal{C})$ which are directly connected to either BS1 or
BS2, then $K_1$ has an inhomogeneous Poisson distribution with
density $\rho p_1(x)$ where $p_1(x)$ is given by Eq. (2).

Proof: Let K denote the set of vehicles in $G(L, \rho, \mathcal{C})$.
Then K has a homogeneous Poisson distribution with density
$\rho$ over the segment [0, L]. Consider a realization of K and
remove a vehicle located at x from this realization with
probability $1 - p_1(x)$, independent of the removal
of other vehicles. The remaining set of vehicles can be
effectively viewed as a realization of $K_1$. Note that the above
procedure which removes/retains vehicles independently with
some probabilities is called thinning [38]. Let $\mathcal{N}(K_1)$ be the
number of vehicles in $K_1$ and $\mathcal{N}(K)$ the number of vehicles
in K. Then following the law of total probability

$$\Pr(\mathcal{N}(K_1) = j) = \sum_{i=j}^{\infty} \Pr(\mathcal{N}(K) = i) \Pr(\mathcal{N}(K_1) = j \mid \mathcal{N}(K) = i). \tag{3}$$


For a randomly chosen vehicle in K, the vehicle is known
to be uniformly distributed in [0, L]. Hence, the probability
that a vehicle is in $K_1$ given that the vehicle is in K is

$$q = \frac{1}{L} \int_0^L p_1(x)\,dx. \tag{4}$$

Since the events that randomly chosen vehicles in K
are directly connected to either BS1 or BS2 are independent
and identically distributed, the number of vehicles in $K_1$ among i
vehicles in K follows the binomial
distribution B(i, q). Thus

$$\Pr(\mathcal{N}(K_1) = j \mid \mathcal{N}(K) = i) = \binom{i}{j} q^j (1 - q)^{i - j}. \tag{5}$$

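Applying Eq. (1) and (5) into Eq. (3) we have

$$\Pr(\mathcal{N}(K_1) = j) = \sum_{i=j}^{\infty} \binom{i}{j} q^j (1 - q)^{i - j} \frac{(\rho L)^i e^{-\rho L}}{i!} = \frac{\left(\int_0^L \rho p_1(x)\,dx\right)^j e^{-\int_0^L \rho p_1(x)\,dx}}{j!}. \tag{6}$$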

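Furthermore, denote by $\mathcal{N}(K_1(l))$ the number of vehicles in
a road segment l within [0, L] which are directly connected
to at least one BS. Using the above procedure, it is straightforward to
show that

$$\Pr(\mathcal{N}(K_1(l)) = j) = \frac{\left(\int_l \rho p_1(x)\,dx\right)^j e^{-\int_l \rho p_1(x)\,dx}}{j!}. \tag{7}$$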

For  n  mutually  disjoint  road  segments  li,  I2,  ■  ■  ■  ,ln  in  [0,  L], 
the  random  variables  ••  •  ,Af(K\(ln))  are  mutu¬ 

ally  independent.  This  is  because  the  event  that  one  vehicle  is 
directly  connected  to  either  BS1  or  BS2  is  not  affected  by  the 
locations  of  other  vehicles,  and  whether  or  not  those  vehicles 
are  directly  connected  to  either  BS1  or  BS2.  Consequently, 
the  existence  and  locations  of  vehicles  in  one  road  segment 
will  not  affect  the  number  of  vehicles  directly  connected  to 
BS1  or  BS2  in  another  disjoint  road  segment.  With  the  above 
independence  property,  Eq.  (6)  and  (7),  the  proof  is  then 
complete.  Note  that  some  parts  of  the  proof  are  similar  to 
the  arguments  used  in  [38].  ■ 
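The thinning argument of Lemma 1 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and `p1_unit_disk` is an assumed stand-in for $p_1(x)$ (it equals 1 within distance $R$ of either base station, which is the unit disk form used later in Section V). If Lemma 1 holds, the retained count should have mean and variance both close to $\int_0^L \rho p_1(x)\,dx$.

```python
import random


def p1_unit_disk(x, L, R):
    """Assumed example of p_1(x): 1 within range R of BS1 (at 0) or BS2 (at L)."""
    return 1.0 if (x <= R or x >= L - R) else 0.0


def sample_poisson(mean, rng):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def thinned_count(rho, L, p1, rng):
    """Draw K (homogeneous Poisson on [0, L]) and keep each vehicle w.p. p1(x)."""
    kept = 0
    for _ in range(sample_poisson(rho * L, rng)):
        x = rng.uniform(0.0, L)
        if rng.random() < p1(x):
            kept += 1
    return kept


def estimate_mean_and_variance(rho, L, p1, trials=5000, seed=1):
    """Sample mean and unbiased sample variance of the thinned count N(K_1)."""
    rng = random.Random(seed)
    counts = [thinned_count(rho, L, p1, rng) for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, var
```

For example, with $\rho = 1/50$, $L = 3000$ and $R = 1000$, the integral $\int_0^L \rho p_1(x)\,dx$ equals $2R\rho = 40$, so both estimates should lie near 40.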

Lemma 2. Let $p_2(x)$ be the probability that a vehicle located at $x$ in $G(L, \rho, \mathcal{C})$ is directly connected to at least one vehicle in $K_1$. Then

$$p_2(x) = 1 - e^{-\int_0^L g_v^{\mathcal{C}}(\|x - y\|)\, \rho p_1(y)\,dy} \tag{8}$$

where $p_1(y)$ is given by Eq. (2) and $\|\cdot\|$ denotes the Euclidean norm.

Proof: Partition $[0, L]$ into $L/dy$ non-overlapping intervals of differential length $dy$. Since $dy$ is very small, the probability that more than one vehicle lies within an interval of length $dy$ can be ignored, and the probability that exactly one vehicle lies within $dy$ is $\rho\,dy$. The probability that there is a vehicle in $[y, y + dy]$ which is also in $K_1$ is then $\rho p_1(y)\,dy$. The vehicles at $x$ and $y$ are directly connected to each other with probability $g_v^{\mathcal{C}}(\|x - y\|)$. Therefore, the probability that a vehicle at $x$ is directly connected to a vehicle in $K_1$ located in $[y, y + dy]$ is $g_v^{\mathcal{C}}(\|x - y\|)\, \rho p_1(y)\,dy$.

Let $h(x, y)$ denote the probability that the vehicle at $x$ is not directly connected to any of the vehicles in $K_1$ located within $[0, y]$. Because the events that distinct pairs of vehicles are directly connected are independent, the event that the vehicle at $x$ is not directly connected to any of the vehicles in $K_1$ located within $[0, y]$ is independent of the event that the same vehicle is not directly connected to the vehicle in $K_1$ located within $[y, y + dy]$ (if there is any). We have

$$h(x, y + dy) = h(x, y)\left(1 - g_v^{\mathcal{C}}(\|x - y\|)\, \rho p_1(y)\,dy\right) \tag{9}$$

where the second factor on the right hand side is the complement of the probability that a vehicle at $x$ is directly connected to a vehicle in $K_1$ located in $[y, y + dy]$. Eq. (9) leads to

$$dh(x, y) = -h(x, y)\, g_v^{\mathcal{C}}(\|x - y\|)\, \rho p_1(y)\,dy. \tag{10}$$

Solving Eq. (10) with the initial condition $h(x, 0) = 1$, the probability that a vehicle at $x$ is not directly connected to any vehicle in $K_1$ is

$$h(x, L) = e^{-\int_0^L g_v^{\mathcal{C}}(\|x - y\|)\, \rho p_1(y)\,dy}. \tag{11}$$

The result follows immediately. ■

The  following  two  theorems  give  the  access  probability  pa 
and  the  connectivity  probability  pc  respectively. 

Theorem 1. Denote by $p_a(x)$ the access probability of a vehicle at $x$, i.e. the probability that the vehicle at $x$ is connected to either BS1 or BS2 in at most two hops. Then

$$p_a(x) = 1 - (1 - p_1(x))(1 - p_2(x)) \tag{12}$$

where $p_1(x)$ is given by Eq. (2) and $p_2(x)$ is given by Eq. (8).

Proof: The result follows immediately from the observation that the event that a vehicle at $x$ is directly connected to either BS1 or BS2 is independent of the event that the same vehicle is directly connected to at least one vehicle in $K_1$. ■
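Eq. (8) and (12) can be evaluated numerically for any connection model. The sketch below is one possible way to do so, not the authors' implementation: the function names are hypothetical, $p_1$ and $g_v$ are passed in as callables because Eq. (2) is defined outside this section, and the integral in Eq. (8) is approximated by a simple midpoint rule.

```python
import math


def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b] with n panels."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def p2(x, L, rho, p1, g_v, n=2000):
    """Eq. (8): probability that a vehicle at x has a direct link to some vehicle in K_1."""
    exponent = integrate(lambda y: g_v(abs(x - y)) * rho * p1(y), 0.0, L, n)
    return 1.0 - math.exp(-exponent)


def access_probability_at(x, L, rho, p1, g_v, n=2000):
    """Eq. (12): probability that a vehicle at x reaches a BS in at most two hops."""
    return 1.0 - (1.0 - p1(x)) * (1.0 - p2(x, L, rho, p1, g_v, n))
```

As a usage example, under the unit disk model with $R = 1000$, $r = 500$, $L = 2400$ and $\rho = 1/500$, one can pass `p1 = lambda x: 1.0 if (x <= R or x >= L - R) else 0.0` and `g_v = lambda d: 1.0 if d <= r else 0.0`; for a vehicle at $x = 1200$ the result should be close to $1 - e^{-\rho(2R + 2r - L)}$.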

Theorem 2 (Approximate result). Denote by $p_c$ the connectivity probability of $G(L, \rho, \mathcal{C})$, i.e. the probability that all vehicles in the subnetwork $G(L, \rho, \mathcal{C})$ are connected to either BS1 or BS2 in at most two hops. Assume that the event that a vehicle is connected to either BS1 or BS2 in at most two hops is independent of the event that another vehicle is connected to either BS1 or BS2 in at most two hops. Then

$$p_c = e^{-\int_0^L \rho (1 - p_a(x))\,dx} \tag{13}$$

where $p_a(x)$ is given by Eq. (12).

Proof: Let $K_2$ be the set of vehicles in $G(L, \rho, \mathcal{C})$ which are connected to either BS1 or BS2 in exactly two hops. Together with the definition of $K_1$ in Lemma 1, let $\overline{K_1 + K_2} = K \setminus (K_1 + K_2)$ be the set of vehicles in $G(L, \rho, \mathcal{C})$ which are not connected to either BS1 or BS2 in at most two hops. Apply the thinning procedure to $K$, i.e. consider a realization of $K$ and remove each vehicle located at $x$ independently from this realization with probability $p_a(x)$. The resulting set of vehicles can be viewed as a realization of $\overline{K_1 + K_2}$ under our assumption that the event that one vehicle is connected to either BS in at most two hops is independent of the event that another vehicle is connected to either BS in at most two hops, given that the probability that a vehicle at $x$ is connected to either BS in at most two hops is $p_a(x)$. Using the same technique as in the proof of Lemma 1, it can readily be shown that $\overline{K_1 + K_2}$ has an inhomogeneous Poisson distribution with density $\rho(1 - p_a(x))$. All vehicles in $G(L, \rho, \mathcal{C})$ are connected to either BS1 or BS2 in at most two hops if and only if $\mathcal{N}(\overline{K_1 + K_2}) = 0$. The result follows. ■
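The approximation in Eq. (13) is a single one-dimensional integral, so it is cheap to evaluate. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the integral is approximated with a midpoint rule, and the example access probability `p_a_unit_disk_long` is the unit disk form of $p_a(x)$ for $L > 2R + 2r$ (case (IV) of Section V-A), included only to make the example self-contained.

```python
import math


def approx_connectivity(L, rho, p_a, n=4000):
    """Eq. (13): exp(-integral over [0, L] of rho * (1 - p_a(x))), midpoint rule."""
    h = L / n
    uncovered = h * sum(1.0 - p_a((k + 0.5) * h) for k in range(n))
    return math.exp(-rho * uncovered)


def p_a_unit_disk_long(x, L, R, r, rho):
    """Assumed example p_a(x): unit disk model with L > 2R + 2r."""
    d = min(x, L - x)  # distance from x to the nearer base station
    if d <= R:
        return 1.0  # directly covered by a BS
    if d <= R + r:
        return 1.0 - math.exp(-rho * (R + r - d))  # may relay via a vehicle in K_1
    return 0.0  # too far from both BSs for a two-hop path
```

For this example the integral has the closed form $(L - 2R - 2r) + 2(1 - e^{-\rho r})/\rho$, which gives a convenient check on the numerical value.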

Note  that  Theorem  2  only  gives  an  approximate  result 
for  the  connectivity  probability  because  of  the  independence 
assumption.  The  following  lemma  proves,  in  a  way,  that  the 
event  that  a  vehicle  is  connected  to  either  BS1  or  BS2  in  at 
most  two  hops  is  not  independent  of  the  event  that  another 
vehicle  is  connected  to  either  BS1  or  BS2  in  at  most  two 
hops. 

Lemma 3. Let $h(x) = 1 - p_2(x)$ be the probability that a vehicle at $x$ is not directly connected to any vehicle in $K_1$; let $h(x_1, x_2)$ be the probability that two vehicles, at $x_1$ and $x_2$ respectively, are not directly connected to any vehicle in $K_1$. Then $h(x_1, x_2) \geq h(x_1) h(x_2)$.

Proof: Let $h(x_1, x_2; y)$ denote the probability that two vehicles, at $x_1$ and $x_2$ respectively, are not directly connected to any vehicle in $K_1$ located in $[0, y]$. Using an argument similar to that for Eq. (9), we have

$$h(x_1, x_2; y + dy) = h(x_1, x_2; y)\, k(x_1, x_2; y) \tag{14}$$

where $k(x_1, x_2; y) = (1 - g_v^{\mathcal{C}}(\|x_1 - y\|))(1 - g_v^{\mathcal{C}}(\|x_2 - y\|))\, \rho p_1(y)\,dy + (1 - \rho p_1(y)\,dy)$. The first term of $k(x_1, x_2; y)$ is the probability that there is a vehicle in $K_1$ located in $[y, y + dy]$ and neither of the vehicles at $x_1$ and $x_2$ is directly connected to it. The second term is the probability that there is no vehicle in $K_1$ located in $[y, y + dy]$. Expanding $k(x_1, x_2; y)$ we have

$$k(x_1, x_2; y) = 1 - g_v^{\mathcal{C}}(\|x_1 - y\|)\, \rho p_1(y)\,dy - g_v^{\mathcal{C}}(\|x_2 - y\|)\, \rho p_1(y)\,dy + g_v^{\mathcal{C}}(\|x_1 - y\|)\, g_v^{\mathcal{C}}(\|x_2 - y\|)\, \rho p_1(y)\,dy.$$

Using the same approach as in Lemma 2 we obtain

$$\begin{aligned}
h(x_1, x_2) &= e^{-\int_0^L \left[g_v^{\mathcal{C}}(\|x_1 - y\|) + g_v^{\mathcal{C}}(\|x_2 - y\|)\right] \rho p_1(y)\,dy} \times e^{\int_0^L g_v^{\mathcal{C}}(\|x_1 - y\|)\, g_v^{\mathcal{C}}(\|x_2 - y\|)\, \rho p_1(y)\,dy} \\
&\geq e^{-\int_0^L \left[g_v^{\mathcal{C}}(\|x_1 - y\|) + g_v^{\mathcal{C}}(\|x_2 - y\|)\right] \rho p_1(y)\,dy} \\
&= h(x_1) h(x_2) \quad \text{(from Eq. (11))}.
\end{aligned}$$


Before obtaining the exact result for the connectivity probability, we introduce some properties in the following lemma.

Lemma 4. Let $p_c(\mathbf{y})$ be the connectivity probability of $G(L, \rho, \mathcal{C})$ conditioned on the number of vehicles directly connected to either BS being $n$ and these vehicles being located at $\mathbf{y} = \{y_1, y_2, \ldots, y_n : 0 \leq y_i \leq L,\ 1 \leq i \leq n\}$; let $p_Y(\mathbf{y})$ be the probability density function (pdf) of $\mathbf{y}$ conditioned on there being $n$ vehicles directly connected to either BS. The following properties hold.

(i)
$$p_Y(\mathbf{y}) = \prod_{i=1}^{n} \frac{p_1(y_i)}{\int_0^L p_1(x)\,dx} \tag{15}$$

(ii)
$$p_c(\mathbf{y}) = e^{-\int_0^L \rho (1 - p_1(x)) \prod_{i=1}^{n} \left(1 - g_v^{\mathcal{C}}(\|x - y_i\|)\right) dx} \tag{16}$$

Proof: For $n = 1$, $p_Y(y_1) = p_1(y_1) / \int_0^L p_1(x)\,dx$ is the probability density that a vehicle in $K_1$ is located at $y_1$. Since $p_1(y_i)$ and $p_1(y_j)$ are mutually independent for $i \neq j$, Eq. (15) follows.

For Eq. (16), note that a vehicle at $x$ is not connected to any BS in at most two hops if it is not directly connected to any BS (the probability is $1 - p_1(x)$) and it is not directly connected to any of the vehicles located at $\mathbf{y}$, given that these vehicles are in $K_1$ (the probability is $1 - g_v^{\mathcal{C}}(\|x - y_i\|)$ for $1 \leq i \leq n$). That is, a vehicle at $x$ cannot access any BS in at most two hops with probability

$$(1 - p_1(x)) \prod_{i=1}^{n} \left(1 - g_v^{\mathcal{C}}(\|x - y_i\|)\right). \tag{17}$$

Eq. (17) is valid when $x \notin \mathbf{y}$. When $x = y_j$ for some $j$, we assume that $g_v^{\mathcal{C}}(0) = 1$. This implies that $p_a(x \mid \mathbf{y}) = 1$, so Eq. (17) is still valid when $x \in \mathbf{y}$. Applying the thinning procedure and the technique used in Lemma 1, the number of vehicles which are neither directly connected to any BS nor directly connected to any of the vehicles at $\mathbf{y}$ is an inhomogeneous Poisson random variable with density $\rho (1 - p_1(x)) \prod_{i=1}^{n} (1 - g_v^{\mathcal{C}}(\|x - y_i\|))$. The result follows immediately. ■


Theorem 3 (Exact result). Denote by $p_c$ the connectivity probability of $G(L, \rho, \mathcal{C})$, i.e. the probability that all vehicles in the subnetwork $G(L, \rho, \mathcal{C})$ are connected to either BS1 or BS2 in at most two hops. Then

$$p_c = \sum_{n=0}^{\infty} \Pr(\mathcal{N}(K_1) = n) \int_{[0,L]^n} p_c(\mathbf{y})\, p_Y(\mathbf{y})\,d\mathbf{y} \tag{18}$$

where $p_c(\mathbf{y})$ and $p_Y(\mathbf{y})$ are given by Lemma 4 and $\Pr(\mathcal{N}(K_1) = n)$ is given by Lemma 1. When $n = 0$, we declare

$$\int_{[0,L]^n} p_c(\mathbf{y})\, p_Y(\mathbf{y})\,d\mathbf{y}\,\Big|_{n=0} = p_c(\mathbf{y})\, p_Y(\mathbf{y})\,\Big|_{n=0} = e^{-\int_0^L \rho (1 - p_1(x))\,dx}.$$

Proof: Eq. (18) follows directly from the law of total probability, so the details are omitted here. ■

Eq. (18) gives an exact formula for the connectivity probability which does not rely on the assumption that the event that a vehicle is connected to either BS in two hops and the event that another vehicle is connected to either BS in two hops are independent. However, Eq. (18) is much more complicated than the approximate result in Eq. (13). In many situations, Eq. (13) provides a reasonably accurate result for the connectivity probability. Therefore we include both results in this paper.

V. Performance Evaluation under Specific Channel Models

Based on the analysis in Section IV, we further derive and compare in this section the access probability and connectivity probability under two specific channel models, i.e. the unit disk model and the log-normal shadowing model.

A. Unit Disk Model

In the unit disk model $\mathcal{U}$, two vehicles are directly connected if and only if their Euclidean distance is less than or equal to $r$, and a vehicle and a BS are directly connected if and only if their Euclidean distance is not more than $R$. In other words,

$$g_v^{\mathcal{U}}(x) = \begin{cases} 1 & \text{if } x \leq r \\ 0 & \text{otherwise,} \end{cases} \qquad g_b^{\mathcal{U}}(x) = \begin{cases} 1 & \text{if } x \leq R \\ 0 & \text{otherwise,} \end{cases}$$

where $r$ and $R$ are predetermined values, commonly known as the transmission ranges. Typically we have $R > r$. Substituting the above equations into Eq. (2), (8) and (12), we obtain the access probability under the unit disk model $\mathcal{U}$:

(I) For $0 < L \leq 2R$, $p_1(x) = 1$, which implies that $p_a(x) = 1$ for $x \in [0, L]$. Hence,

$$p_a = 1.$$

(II) For $2R < L \leq 2R + r$, $p_1(x)$ is 0 when $x \in (R, L - R)$ and 1 otherwise. When $x \in (R, L - R)$, Eq. (8) becomes

$$\begin{aligned}
p_2(x) &= 1 - e^{-\int_0^R g_v^{\mathcal{U}}(\|x - y\|)\rho\,dy - \int_{L-R}^{L} g_v^{\mathcal{U}}(\|x - y\|)\rho\,dy} \\
&= 1 - e^{-\int_{x-r}^{R} \rho\,dy - \int_{L-R}^{x+r} \rho\,dy} \\
&= 1 - e^{-\rho(2R + 2r - L)}.
\end{aligned}$$

Substituting $p_2(x)$ into Eq. (12),

$$p_a = \frac{2R}{L} + \frac{L - 2R}{L}\left(1 - e^{-\rho(2R + 2r - L)}\right) = 1 - \frac{L - 2R}{L}\, e^{-\rho(2R + 2r - L)}.$$

(III) For $2R + r < L \leq 2R + 2r$ and $x \in (R, L - R)$, Eq. (8) becomes

$$p_2(x) = 1 - e^{-\int_0^R g_v^{\mathcal{U}}(\|x - y\|)\rho\,dy - \int_{L-R}^{L} g_v^{\mathcal{U}}(\|x - y\|)\rho\,dy} = \begin{cases} 1 - e^{-\rho(R + r - x)} & \text{when } x \in (R, L - R - r), \\ 1 - e^{-\rho(2R + 2r - L)} & \text{when } x \in [L - R - r, R + r], \\ 1 - e^{-\rho(R + r + x - L)} & \text{when } x \in (R + r, L - R). \end{cases}$$

Substituting $p_2(x)$ into Eq. (12),

$$\begin{aligned}
p_a &= \frac{2R}{L} + \frac{1}{L}\int_R^{L-R-r} \left(1 - e^{-\rho(R + r - x)}\right) dx + \frac{1}{L}\int_{L-R-r}^{R+r} \left(1 - e^{-\rho(2R + 2r - L)}\right) dx + \frac{1}{L}\int_{R+r}^{L-R} \left(1 - e^{-\rho(R + r + x - L)}\right) dx \\
&= 1 + \frac{2}{\rho L}\left(e^{-\rho r} - e^{-\rho(2R + 2r - L)}\right) - \frac{2R + 2r - L}{L}\, e^{-\rho(2R + 2r - L)}.
\end{aligned}$$

(IV) For $L > 2R + 2r$, from Eq. (8),

$$p_2(x) = \begin{cases} 1 - e^{-\int_{x-r}^{R} \rho\,dy} = 1 - e^{-\rho(R + r - x)} & \text{when } x \in (R, R + r], \\ 1 - e^{-\int_{L-R}^{x+r} \rho\,dy} = 1 - e^{-\rho(R + r + x - L)} & \text{when } x \in [L - R - r, L - R), \\ 0 & \text{when } x \in (R + r, L - R - r). \end{cases}$$

Substituting $p_2(x)$ into Eq. (12),

$$\begin{aligned}
p_a &= \frac{2R}{L} + \frac{1}{L}\int_R^{R+r} \left(1 - e^{-\rho(R + r - x)}\right) dx + \frac{1}{L}\int_{L-R-r}^{L-R} \left(1 - e^{-\rho(R + r + x - L)}\right) dx \\
&= \frac{2R + 2r}{L} + \frac{2(e^{-\rho r} - 1)}{\rho L}.
\end{aligned}$$
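The four cases (I) to (IV) for the access probability under the unit disk model can be collected into one function. This is a sketch only: the function name is hypothetical, it assumes $R \geq r$ as in the text ("typically $R > r$"), and it treats $p_a$ as the average of $p_a(x)$ over $[0, L]$, which is what the closed forms above imply.

```python
import math


def access_probability_unit_disk(L, R, r, rho):
    """Closed-form average access probability p_a, unit disk model, assuming R >= r."""
    if L <= 2 * R:
        return 1.0  # case (I): every point is within range of a BS
    e_gap = math.exp(-rho * (2 * R + 2 * r - L))
    if L <= 2 * R + r:
        # case (II): every gap vehicle can hear relays on both sides
        return 1.0 - (L - 2 * R) / L * e_gap
    if L <= 2 * R + 2 * r:
        # case (III): only the middle of the gap can hear relays on both sides
        return (1.0
                + 2.0 / (rho * L) * (math.exp(-rho * r) - e_gap)
                - (2 * R + 2 * r - L) / L * e_gap)
    # case (IV): the middle of the gap cannot reach any relay
    return (2 * R + 2 * r) / L + 2.0 * (math.exp(-rho * r) - 1.0) / (rho * L)
```

A quick consistency check is that the expression is continuous at the case boundaries $L = 2R$, $L = 2R + r$ and $L = 2R + 2r$, which the formulas above satisfy.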

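To derive the equations for the connectivity probability (exact result), we first look at Lemma 4. For the unit disk model, $p_1(x)$ is 1 when $x \in [0, R] \cup [L - R, L]$ and zero otherwise. Hence, Eq. (15) becomes

$$p_Y(\mathbf{y}) = \frac{1}{(\min(2R, L))^n} \tag{19}$$

when $y_i \in [0, R] \cup [L - R, L]$ for all $y_i$, and zero otherwise. Eq. (16) becomes

$$p_c(\mathbf{y}) = e^{-\int_R^{L-R} \rho \prod_{i=1}^{n} \left(1 - g_v^{\mathcal{U}}(\|x - y_i\|)\right) dx}. \tag{20}$$

Note that $\prod_{i=1}^{n} (1 - g_v^{\mathcal{U}}(\|x - y_i\|))$ is 1 when $\|x - y_i\| > r$ for all $y_i$. For $L \leq 2R$, we can easily obtain $p_c$ from Eq. (18) by substituting Eq. (19) and (20) into it (shown later). To obtain the result for $L > 2R$, the following transformation simplifies the arithmetic.

Let $S_a$ (and $S_b$) be the set of vehicles in $[0, R]$ (and $[L - R, L]$) which, by definition, are also in $K_1$. Let $\mathcal{N}(S_a)$ (and $\mathcal{N}(S_b)$) be the number of vehicles in $S_a$ (and $S_b$). Note that $S_a \cup S_b = K_1$ and $\mathcal{N}(S_a) + \mathcal{N}(S_b) = \mathcal{N}(K_1)$. Let $y_a$ (and $y_b$) be the location of the vehicle in $S_a$ (and $S_b$) which is furthest from BS1 (and BS2). That is,

$$y_a = \begin{cases} 0 & \text{if } S_a = \emptyset \\ \max\{y : y \in S_a\} & \text{otherwise,} \end{cases} \tag{21}$$

$$y_b = \begin{cases} L & \text{if } S_b = \emptyset \\ \min\{y : y \in S_b\} & \text{otherwise.} \end{cases} \tag{22}$$

Therefore, the cumulative distribution function of $y_a$ is

$$\Pr(y_a \leq y_{\max}) = \Pr(y_i \leq y_{\max},\ \forall y_i \in S_a) = \left(\frac{y_{\max}}{R}\right)^{n_a} \quad \text{for } n_a = \mathcal{N}(S_a) \geq 1.$$

Since Eq. (21) defines $y_a = 0$ when $\mathcal{N}(S_a) = 0$, we have the pdf of $y_a$ as

$$f_a(y_a; n_a) = \begin{cases} \dfrac{n_a}{R}\left(\dfrac{y_a}{R}\right)^{n_a - 1} & \text{if } n_a \geq 1 \\ \delta(y_a) & \text{if } n_a = 0. \end{cases}$$

Similarly we have the pdf of $y_b$ as

$$f_b(y_b; n_b) = \begin{cases} \dfrac{n_b}{R}\left(\dfrac{L - y_b}{R}\right)^{n_b - 1} & \text{if } n_b \geq 1 \\ \delta(L - y_b) & \text{if } n_b = 0. \end{cases}$$

With $y_a$ and $y_b$, we can rewrite Eq. (20) as

$$p_c(y_a, y_b) = e^{-\int_{\max\{R,\, y_a + r\}}^{\min\{L - R,\, y_b - r\}} \rho\,dx} \tag{23}$$

(the integral is taken to be zero when its lower limit exceeds its upper limit), and Eq. (18) can be transformed into

$$p_c = \sum_{n_a=0}^{\infty} \sum_{n_b=0}^{\infty} \Pr(\mathcal{N}(S_a) = n_a) \Pr(\mathcal{N}(S_b) = n_b) \int_0^R \int_{L-R}^{L} p_c(y_a, y_b)\, f_a(y_a; n_a)\, f_b(y_b; n_b)\,dy_b\,dy_a \tag{24}$$

for $L > 2R$. Eq. (24) can be further simplified under different cases. For $n_a > 0$ and $n_b > 0$, Eq. (24) becomes

$$\begin{aligned}
p_c^{(n_a>0,\,n_b>0)} &= \int_0^R \int_{L-R}^{L} p_c(y_a, y_b) \left[\sum_{n_a=1}^{\infty} \sum_{n_b=1}^{\infty} \Pr(\mathcal{N}(S_a) = n_a) \Pr(\mathcal{N}(S_b) = n_b)\, f_a(y_a; n_a)\, f_b(y_b; n_b)\right] dy_b\,dy_a \\
&= \int_0^R \int_{L-R}^{L} p_c(y_a, y_b) \left[\sum_{n_a=1}^{\infty} \sum_{n_b=1}^{\infty} \frac{(\rho R)^{n_a} e^{-\rho R}}{n_a!} \frac{(\rho R)^{n_b} e^{-\rho R}}{n_b!} \frac{n_a}{R}\left(\frac{y_a}{R}\right)^{n_a - 1} \frac{n_b}{R}\left(\frac{L - y_b}{R}\right)^{n_b - 1}\right] dy_b\,dy_a \\
&= \int_0^R \int_{L-R}^{L} p_c(y_a, y_b)\, \rho^2 e^{-2\rho R} \left[\sum_{n_a=1}^{\infty} \frac{(\rho y_a)^{n_a - 1}}{(n_a - 1)!}\right] \left[\sum_{n_b=1}^{\infty} \frac{(\rho (L - y_b))^{n_b - 1}}{(n_b - 1)!}\right] dy_b\,dy_a \\
&= \int_0^R \int_{L-R}^{L} p_c(y_a, y_b)\, \rho^2 e^{-2\rho R} e^{\rho y_a} e^{\rho(L - y_b)}\,dy_b\,dy_a. 
\end{aligned} \tag{25}$$

For $n_a = 0$ and $n_b > 0$, Eq. (24) becomes

$$\begin{aligned}
p_c^{(n_a=0,\,n_b>0)} &= \sum_{n_b=1}^{\infty} e^{-\rho R}\, \frac{(\rho R)^{n_b} e^{-\rho R}}{n_b!} \int_{L-R}^{L} p_c(0, y_b)\, \frac{n_b}{R}\left(\frac{L - y_b}{R}\right)^{n_b - 1} dy_b \\
&= \int_{L-R}^{L} p_c(0, y_b)\, \rho e^{-2\rho R} \sum_{n_b=1}^{\infty} \frac{(\rho (L - y_b))^{n_b - 1}}{(n_b - 1)!}\,dy_b \\
&= \int_{L-R}^{L} p_c(0, y_b)\, \rho e^{-2\rho R} e^{\rho(L - y_b)}\,dy_b.
\end{aligned} \tag{26}$$

With similar steps (omitted here), for $n_a > 0$ and $n_b = 0$, Eq. (24) becomes

$$p_c^{(n_a>0,\,n_b=0)} = \int_0^R p_c(y_a, L)\, \rho e^{-2\rho R} e^{\rho y_a}\,dy_a. \tag{27}$$

Note that Eq. (27) can be shown to equal Eq. (26) by letting $y_b = L - y_a$; then

$$p_c^{(n_a>0,\,n_b=0)} = \int_{L-R}^{L} p_c(L - y_b, L)\, \rho e^{-2\rho R} e^{\rho(L - y_b)}\,dy_b = \int_{L-R}^{L} p_c(0, y_b)\, \rho e^{-2\rho R} e^{\rho(L - y_b)}\,dy_b = p_c^{(n_a=0,\,n_b>0)}$$

where $p_c(L - y_b, L) = p_c(0, y_b)$. Finally, for $n_a = 0$ and $n_b = 0$,

$$p_c^{(n_a=0,\,n_b=0)} = e^{-\rho R} e^{-\rho R}\, p_c(0, L) = e^{-2\rho R} e^{-\int_R^{L-R} \rho\,dx} = e^{-2\rho R} e^{-\rho(L - 2R)} = e^{-\rho L}. \tag{28}$$

Using Eq. (25), (26), (27) and (28), we can obtain the connectivity probability as follows. Because the steps involved in deriving the results are lengthy (but straightforward), we omit the intermediate steps and include only the results of Eq. (25) and (26) for the reader's convenience.

(I) For $0 < L \leq 2R$, $p_Y(\mathbf{y}) = 1/L^n$ from Eq. (15) and $p_c(\mathbf{y}) = 1$ from Eq. (16), which implies that

$$p_c = \sum_{n=0}^{\infty} \Pr(\mathcal{N}(K_1) = n) = 1.$$

(II) For $2R < L \leq 2R + r$,

$$\begin{aligned}
p_c^{(n_a>0,\,n_b>0)} &= 1 + e^{-\rho L} - 2e^{-\rho R} + e^{-\rho(3R + r - L)} + \left(-\tfrac{1}{4} - \tfrac{1}{2}\rho(L - 2R)\right) e^{-\rho(2R + 2r - L)} - e^{-\rho(L + r - R)} + \tfrac{1}{4} e^{-\rho(L + 2r - 2R)}, \\
p_c^{(n_a=0,\,n_b>0)} &= -e^{-\rho L} + e^{-\rho R} - \tfrac{1}{2} e^{-\rho(3R + r - L)} + \tfrac{1}{2} e^{-\rho(L + r - R)}, \\
p_c &= 1 + \tfrac{1}{4} e^{-\rho(L + 2r - 2R)} - \left(\tfrac{1}{4} + \tfrac{1}{2}\rho(L - 2R)\right) e^{-\rho(2R + 2r - L)}.
\end{aligned}$$

(III) For $2R + r < L \leq 2R + 2r$,

$$\begin{aligned}
p_c^{(n_a>0,\,n_b>0)} &= 1 + e^{-\rho L} + \tfrac{1}{2} e^{-\rho(L - 2R)} - e^{-\rho(L + r - R)} - \left(\tfrac{3}{4} + \tfrac{1}{2}\rho(2R + 2r - L)\right) e^{-\rho(2R + 2r - L)} + \tfrac{1}{4} e^{-\rho(L + 2r - 2R)} - e^{-\rho(L - R - r)}, \\
p_c^{(n_a=0,\,n_b>0)} &= -e^{-\rho L} + \tfrac{1}{2} e^{-\rho(L + r - R)} + \tfrac{1}{2} e^{-\rho(L - R - r)}, \\
p_c &= 1 + \tfrac{1}{2} e^{-\rho(L - 2R)} + \tfrac{1}{4} e^{-\rho(L + 2r - 2R)} - \left(\tfrac{3}{4} + \tfrac{1}{2}\rho(2R + 2r - L)\right) e^{-\rho(2R + 2r - L)}.
\end{aligned}$$

(IV) For $L > 2R + 2r$,

$$\begin{aligned}
p_c^{(n_a>0,\,n_b>0)} &= e^{-\rho L} + \tfrac{1}{4} e^{-\rho(L + 2r - 2R)} - e^{-\rho(L + r - R)} + \tfrac{1}{4} e^{-\rho(L - 2R - 2r)} - e^{-\rho(L - R - r)} + \tfrac{1}{2} e^{-\rho(L - 2R)}, \\
p_c^{(n_a=0,\,n_b>0)} &= -e^{-\rho L} + \tfrac{1}{2} e^{-\rho(L + r - R)} + \tfrac{1}{2} e^{-\rho(L - R - r)}, \\
p_c &= \tfrac{1}{4} e^{-\rho(L + 2r - 2R)} + \tfrac{1}{2} e^{-\rho(L - 2R)} + \tfrac{1}{4} e^{-\rho(L - 2R - 2r)}.
\end{aligned}$$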
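The exact connectivity probability under the unit disk model can be compared against a direct Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not the authors' simulator: the function names are hypothetical, it assumes $R \geq r$, the closed form encodes $p_c$ for cases (I) to (IV) with $p_c = p_c^{(n_a>0,n_b>0)} + 2 p_c^{(n_a=0,n_b>0)} + p_c^{(n_a=0,n_b=0)}$ already summed, and the simulation uses the quantities $y_a$ and $y_b$ of Eq. (21) and (22).

```python
import math
import random


def connectivity_closed_form(L, R, r, rho):
    """Exact p_c under the unit disk model (cases I to IV), assuming R >= r."""
    if L <= 2 * R:
        return 1.0
    e = math.exp
    if L <= 2 * R + r:
        return (1.0 + 0.25 * e(-rho * (L + 2 * r - 2 * R))
                - (0.25 + 0.5 * rho * (L - 2 * R)) * e(-rho * (2 * R + 2 * r - L)))
    if L <= 2 * R + 2 * r:
        return (1.0 + 0.5 * e(-rho * (L - 2 * R)) + 0.25 * e(-rho * (L + 2 * r - 2 * R))
                - (0.75 + 0.5 * rho * (2 * R + 2 * r - L)) * e(-rho * (2 * R + 2 * r - L)))
    return (0.25 * e(-rho * (L + 2 * r - 2 * R)) + 0.5 * e(-rho * (L - 2 * R))
            + 0.25 * e(-rho * (L - 2 * R - 2 * r)))


def sample_poisson(mean, rng):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_connectivity(L, R, r, rho, trials=20000, seed=1):
    """Fraction of random topologies in which every vehicle reaches a BS in <= 2 hops."""
    rng = random.Random(seed)
    connected = 0
    for _ in range(trials):
        xs = [rng.uniform(0.0, L) for _ in range(sample_poisson(rho * L, rng))]
        y_a = max((x for x in xs if x <= R), default=0.0)    # Eq. (21)
        y_b = min((x for x in xs if x >= L - R), default=L)  # Eq. (22)
        if all(x <= R or x >= L - R or x - y_a <= r or y_b - x <= r for x in xs):
            connected += 1
    return connected / trials
```

With $R = 1000$, $r = 500$ and $\rho = 1/500$, the simulated and closed-form values should agree to within sampling error for $L$ in each of the ranges (II), (III) and (IV), for instance $L = 2400$, $2800$ and $3200$.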


Fig. 3. Access probability with L changing under the unit disk model, R = 1000m, r = 500m, ρ = 1/5, 1/50, 1/500 vehicles/m respectively.

B. Log-normal Shadowing Model

The log-normal model $\mathcal{L}$ is commonly used to model real-world signal propagation, where the transmit power loss increases logarithmically with the Euclidean distance between two wireless nodes and varies log-normally due to the shadowing effect caused by the surrounding environment. In the log-normal model, we formulate the received power (in dB) at a destination vehicle as

$$p_{rx} = p_0 - 10\alpha \log_{10} \frac{l}{d_0} + N_\sigma \tag{29}$$

where $p_{rx}$ is the received power (in dBmW) at the destination vehicle; $p_0$ is the power (in dBmW) at a reference distance $d_0$; $\alpha$ is the path loss exponent; $N_\sigma$ is a Gaussian random variable with zero mean and variance $\sigma^2$; and $l$ is the Euclidean distance between the two vehicles (or between a vehicle and a BS, depending on the context). A source vehicle can establish a direct connection to a destination vehicle if the received power $p_{rx}$ at the destination vehicle is greater than or equal to a certain threshold power $p_{th}^v$. Similarly, a source vehicle can establish a two-way direct connection to a destination BS if the received power $p_{rx}$ at the destination BS is greater than or equal to a certain threshold power $p_{th}^b$. In this paper, we assume that wireless connections between vehicles, and between vehicles and BSs, are symmetric. Note that when $\sigma = 0$, the log-normal model reduces to the unit disk model. For this reason, we assign $p_{th}^v = p_0 - 10\alpha \log_{10} \frac{r}{d_0}$ and $p_{th}^b = p_0 - 10\alpha \log_{10} \frac{R}{d_0}$, so that the results under the log-normal model can later be compared with the results under the unit disk model. It can be shown that under the log-normal model

$$g_v^{\mathcal{L}}(x) = \Pr(p_{rx} \geq p_{th}^v) = Q\!\left(\frac{10\alpha}{\sigma} \log_{10} \frac{x}{r}\right)$$

where the function $Q(y) = \frac{1}{\sqrt{2\pi}} \int_y^{\infty} e^{-x^2/2}\,dx$ is the tail probability of the standard normal distribution. Similarly, $g_b^{\mathcal{L}}(x) = Q\!\left(\frac{10\alpha}{\sigma} \log_{10} \frac{x}{R}\right)$. When $\sigma = 0$, $g_v^{\mathcal{L}}(x) = \Pr(x \leq r)$ and $g_b^{\mathcal{L}}(x) = \Pr(x \leq R)$, and the log-normal model becomes the unit disk model as expected.

The access probability can then be obtained for different values of $\alpha$ and $\sigma$ by computing Eq. (12) using any numerical integration technique. The approximate and exact results for the connectivity probability can be obtained by computing Eq. (13) and (18) using any numerical integration technique.
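The log-normal connection functions are straightforward to evaluate with the complementary error function, since $Q(y) = \tfrac{1}{2}\operatorname{erfc}(y/\sqrt{2})$. The sketch below is illustrative: the function names and the parameter name `reach` (standing for $r$ or $R$) are hypothetical, the $\sigma = 0$ branch implements the unit disk limit described in the text, and the value 1 at distance zero follows the assumption $g_v(0) = 1$ made in the proof of Lemma 4.

```python
import math


def q_function(y):
    """Tail probability of the standard normal distribution, Q(y)."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))


def g_lognormal(x, reach, alpha, sigma):
    """Connection probability at distance x: Q((10 * alpha / sigma) * log10(x / reach)).

    reach is r for vehicle-to-vehicle links and R for vehicle-to-BS links.
    """
    if x <= 0.0:
        return 1.0  # co-located nodes are taken to be connected
    if sigma == 0.0:
        return 1.0 if x <= reach else 0.0  # unit disk limit
    return q_function(10.0 * alpha / sigma * math.log10(x / reach))
```

For instance, at $x$ equal to the nominal range the connection probability is exactly one half for any $\sigma > 0$, and it decreases monotonically with distance. The returned function can be passed as the connection model when evaluating Eq. (8) and (12) numerically.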


VI.  Analytical  and  Simulation  Results 
A.  Unit  Disk  Model 

Fig. 3 shows the access probability for different values of L and ρ. The analytical results are verified by simulation results obtained from 40000 randomly generated network topologies. Because the number of random network instances used in the simulation is very large, the confidence interval is too small to be distinguishable and is therefore omitted from this plot and from the other plots. As shown in the figure, the access probability decreases with L once L exceeds a certain limit. For small ρ, the access probability decreases as soon as L > 2R. This is because when the vehicle density ρ (number of vehicles per meter) is low, a vehicle is either directly connected to a BS or disconnected, i.e. it cannot reach any BS in at most two hops: a vehicle outside the transmission range of every BS is unlikely to find a one-hop relay within its own range through which it can access a BS. For large ρ, however, it is easier for a vehicle outside the transmission range of every BS to find a one-hop relay to access a BS. In general the access probability increases with ρ, because as the vehicle density increases, vehicles in the gap between the transmission ranges of the BSs are more likely to find a neighbor within the transmission range of a BS to act as a relay.

Fig. 4. Connectivity probability with L changing under the unit disk model, R = 1000m, r = 500m, ρ = 1/5, 1/50, 1/500 vehicles/m respectively.

Similarly, Fig. 4 shows the connectivity probability for different values of L and ρ. The exact analytical results are verified by the simulation results. The approximate analytical result is shown to be reasonably close to the exact analytical result. The figure shows that when L < 2R + r = 2500 meters, it is easy for all vehicles to be connected to either BS in at most two hops, hence the connectivity probability is high. As L gets larger, it is harder for all vehicles to be connected to the BSs because of the larger possible distances between the vehicles and the BSs. This causes a drop in the connectivity probability, which tends to zero as L goes to infinity. The transition of the connectivity probability from 1 to 0 gets sharper as the vehicle density increases. As ρ goes to infinity, the transition happens at the critical distance L = 2R + 2r = 3000 meters, below which the network is connected with a high probability and above which the network is disconnected with a high probability. Furthermore, networks with a larger ρ have a higher connectivity probability than networks with a smaller ρ when L is small. This is because when the vehicle density is large, it is easier for a vehicle not directly connected to a BS to find, within its communication range, a vehicle that is directly connected to a BS and can act as a relay. When L is large, networks with a larger ρ have a lower connectivity probability than networks with a smaller ρ. This is because at large values of L, a large vehicle density makes it more likely that at least one vehicle is located too far from the BSs to be connected to a BS in at most two hops.

Fig. 5 shows how the transmission range r of the vehicles affects the access probability. The access probability increases with r, and when ρ is large enough, the access probability can be quite close to 1. The figure also shows again that the access probability increases with ρ.

Fig. 5. Access probability with r changing under the unit disk model, R = 1000m, L = 2500m, ρ = 1/5, 1/50, 1/500 vehicles/m respectively.

Fig. 6. Connectivity probability with r changing under the unit disk model, R = 1000m, L = 2500m, ρ = 1/5, 1/50, 1/500 vehicles/m respectively.

With a similar setup, Fig. 6 shows the sensitivity of the connectivity probability to r. For a large ρ, around a certain value of r a small increase in r causes a dramatic increase in the connectivity probability from near 0 to near 1, i.e. the well-known phase transition phenomenon. The figure shows that this phenomenon does not exist for small ρ. Fig. 6 also shows a scenario in which there may be a significant gap between the approximate and exact results for the connectivity probability.

Fig. 7 supports our conclusion that an increase in ρ improves the access probability, as it shows that the access probability increases monotonically with ρ. When ρ is relatively small and the gap region not directly covered by any of the BSs is relatively wide, the access probability is low. In this circumstance, the network operator should consider deploying more BSs along the highway for better connectivity and greater access probability.

B.  Log-normal  Shadowing  Model 

Fig. 8 shows the access probability under the log-normal shadowing model. In general, it is easier for the vehicles in the subnetwork to get access to any BS under the log-normal model. As σ increases, the access probability improves. The improvement in access probability is more significant for high vehicular density.

Fig. 7. Access probability with ρ changing under the unit disk model; L, R and r are constants.

Fig. 8. Access probability with L changing under the log-normal model, R = 1000m, r = 500m, ρ = 1/5, 1/500 vehicles/m respectively under different values of σ. R (r) is the transmission range of a BS (vehicle) ignoring the shadowing effect, i.e. σ = 0.


Fig. 9 shows the connectivity probability under the log-normal model when the vehicle density is low (ρ = 1/500 vpm). The exact analytical results are verified by the simulation results. As the vehicle density increases, the computational complexity of numerically evaluating the exact result increases very quickly. We therefore provide the exact analytical results only for low vehicle density. Furthermore, Fig. 9 shows that the approximate analytical results are reasonably close to the true values when the vehicle density is low. However, as shown in Fig. 10, the discrepancy between the approximate results and the true values can be significant when the vehicle density is high (ρ = 1/5 or 1/50 vpm in Fig. 10). In general, the approximate analytical result always under-estimates the simulation result. The same situation can be observed for the results under the unit disk model. This can be explained by Lemma 3: a vehicle is more likely to be able to access a BS when there is another vehicle nearby that can access the BSs. Because of the independence assumption used in obtaining the approximate analytical result, the approximate result under-estimates the true value.

VII.  Conclusions 

Fig. 9. Connectivity probability with L changing under the log-normal model, R = 1000m, r = 500m, ρ = 1/500 vehicles/m under different values of σ. R (r) is the transmission range of a BS (vehicle) ignoring the shadowing effect, i.e. σ = 0.

Fig. 10. Connectivity probability with L changing under the log-normal model, R = 1000m, r = 500m, ρ = 1/5, 1/50 vehicles/m respectively under different values of σ. R (r) is the transmission range of a BS (vehicle) ignoring the shadowing effect, i.e. σ = 0.

In this paper, we analyzed the connectivity probability and the access probability for a network bounded by two adjacent base stations, in which vehicles are Poissonly distributed with known density and each vehicle can communicate with a base station in at most two hops. Under a general connection model, and then taking the unit disk communication model and the log-normal shadowing model as specific examples, we derived closed-form formulas for the access probability and connectivity probability, taking into account that the base stations and the vehicles have different transmission capabilities. These formulas characterize the relation between the key parameters, i.e. the transmission ranges of the base stations and the vehicles, the distance between adjacent base stations and the vehicle density, and their impact on the access and connectivity probabilities. These results can be useful for a network operator to design a network with a
given level of access guarantee. In future, we plan to extend
the current work on 1-D networks to 2-D networks.

Abstract: Wireless multi-hop networks are being increasingly
used in military and civilian applications. Connectivity is a
prerequisite in wireless multi-hop networks for providing many
network functions. In a wireless network with many concurrent
transmissions, signals transmitted at the same time will mutually
interfere with each other. In this paper we consider the impact of
interference on the connectivity of CSMA networks. Specifically,
consider a network with n nodes uniformly and i.i.d. distributed on a
square $[-\sqrt{n}/2, \sqrt{n}/2]^2$, where a node can only transmit if
the sensed power from any other active transmitter is below a threshold,
i.e. subject to the carrier-sensing constraint, and the transmission
is successful if and only if the SINR is greater than or equal to
a predefined threshold. We provide a sufficient condition and a
necessary condition, i.e. an upper bound and a lower bound on
the transmission power, required for the above network to be
asymptotically almost surely (a.a.s.) connected as $n \to \infty$. The
two bounds differ by a constant factor only as $n \to \infty$. It is shown
that the transmission power only needs to be increased by a
constant factor to combat interference and maintain connectivity
compared with that considering a unit disk model (UDM) without
interference. This result is also in stark contrast with previous
results considering the connectivity of ALOHA networks under
the SINR model.

Index Terms: Connectivity, CSMA, Wireless Network.


I.  Introduction 

Wireless  multi-hop  networks  are  being  increasingly  used  in 
military and civilian applications. Connectivity is a prerequisite
in wireless multi-hop networks for providing many
network functions (e.g. routing, localization and topology
control) [1]–[3]. The scaling behavior of the connectivity
property  when  the  network  becomes  sufficiently  large  is  of 
particular  interest.  A  wireless  multi-hop  network  is  said  to  be 
connected  if  and  only  if  (iff)  there  is  at  least  one  (multi-hop) 
path  between  any  pair  of  nodes  in  the  network. 

Due to the nature of wireless communications, signals transmitted
at the same time will mutually interfere with each other.
The SINR (signal to interference plus noise ratio) model has
been widely used to capture the impact of interference on
network connectivity [2], [4], [5]. Under the SINR model,
the existence of a directional link between a pair of nodes
is determined by the strength of the received signal from the
desired transmitter, the interference caused by other concurrent
transmissions and the background noise. Assume all nodes use
the same transmit power P and let $x_k$, $k \in \Gamma$, be the location
of node k, where $\Gamma$ represents the set of indices of all nodes in
the network. A node j can successfully receive the transmitted
signal from a node i (i.e. node j is directly connected to node
i) if the SINR at $x_j$, denoted by $\mathrm{SINR}(x_i \to x_j)$, is above a
prescribed threshold $\beta$, i.e.

$$\mathrm{SINR}(x_i \to x_j) = \frac{P\,\ell(x_i, x_j)}{N_0 + \gamma \sum_{k \in \mathcal{T}_i} P\,\ell(x_k, x_j)} \ge \beta \quad (1)$$


where $\mathcal{T}_i \subseteq \Gamma$ denotes the subset of nodes transmitting at
the same time as node i and $N_0$ is the background noise
power. The function $\ell(x_i, x_j)$ is the power attenuation from
$x_i$ to $x_j$. The coefficient $0 \le \gamma \le 1$ is the inverse of the
processing gain of the system and it weighs the impact of
interference. In a broadband system using CDMA, $\gamma$ depends
on the orthogonality between codes used during concurrent
transmissions and $\gamma < 1$; in a narrow-band system, $\gamma = 1$ [2],
[5]. Similarly, node i can receive from node j (i.e. node i is
directly connected to node j) iff

$$\mathrm{SINR}(x_j \to x_i) = \frac{P\,\ell(x_j, x_i)}{N_0 + \gamma \sum_{k \in \mathcal{T}_j} P\,\ell(x_k, x_i)} \ge \beta \quad (2)$$


Therefore  node  i  and  node  j  are  directly  connected,  i.e.  a 
bidirectional  link  exists  between  node  i  and  node  j,  iff  both 
(1)  and  (2)  are  satisfied. 
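
The link test in (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`attenuation`, `sinr`, `directly_connected`), the dictionary of node positions and the example geometry are hypothetical, and the power-law attenuation $\ell(s) = s^{-\alpha}$ is the form adopted later in Section III.

```python
import math

def attenuation(tx_pos, rx_pos, alpha):
    """Power-law attenuation l(s) = s**(-alpha) over Euclidean distance s (assumed form)."""
    return math.dist(tx_pos, rx_pos) ** (-alpha)

def sinr(tx, rx, concurrent, positions, power, alpha, gamma=1.0, noise=0.0):
    """SINR at node rx for a transmission from tx, following the form of (1)."""
    signal = power * attenuation(positions[tx], positions[rx], alpha)
    interference = sum(
        power * attenuation(positions[k], positions[rx], alpha) for k in concurrent
    )
    denominator = noise + gamma * interference
    # With no noise and no interferer the ratio is unbounded.
    return math.inf if denominator == 0 else signal / denominator

def directly_connected(i, j, concurrent_i, concurrent_j, positions,
                       power, alpha, beta, gamma=1.0, noise=0.0):
    """A bidirectional link exists iff both (1) and (2) hold."""
    forward = sinr(i, j, concurrent_i, positions, power, alpha, gamma, noise) >= beta
    backward = sinr(j, i, concurrent_j, positions, power, alpha, gamma, noise) >= beta
    return forward and backward

# Hypothetical layout: nodes 0 and 1 are one unit apart, node 2 interferes.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (10.0, 0.0)}
far_case = directly_connected(0, 1, [2], [2], positions,
                              power=1.0, alpha=4.0, beta=10.0)
positions[2] = (2.0, 0.0)  # move the interferer next to the receiver
near_case = directly_connected(0, 1, [2], [2], positions,
                               power=1.0, alpha=4.0, beta=10.0)
print(far_case, near_case)  # prints "True False"
```

With the interferer far away both directions clear the threshold; once it sits one unit from node 1 the forward SINR drops to 1 and the link disappears, which is the correlation between links that the rest of the paper has to handle.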

Dousse et al. [5] use the SINR model to analyze the impact of
interference on connectivity from the percolation perspective.
They consider a network where all nodes are distributed
in $\mathbb{R}^2$ following a homogeneous Poisson point process with
a constant intensity $\lambda$ and an attenuation function $\ell$ with
bounded support. By letting $\mathcal{T}_i = \Gamma \setminus \{i, j\}$, i.e. all other
nodes in the network transmit simultaneously with node i
irrespective of their relative locations to $x_i$ and $x_j$, it is shown
that there exists a very small positive constant $\gamma'$ such that
if $\gamma > \gamma'$ there is no infinite connected component in the
network, i.e. the network does not percolate. Further, when
$\gamma < \gamma'$, there exists $0 < \lambda' < \infty$ such that percolation
can occur when $\lambda > \lambda'$. An improved result by the same
authors in [6] shows that under the more general conditions
that $\lambda > \lambda_c$ and the attenuation function has unbounded
support, percolation occurs when $\gamma < \gamma'$. Here $\lambda_c$ is the
critical node density above which the network with $\gamma = 0$ (i.e.
the unit disk model (UDM) without interference) percolates [7,
p. 48]. These results suggest that percolation under the SINR
model can happen iff $\gamma$ is sufficiently small. They assume that
each node transmits randomly and independently, irrespective
of any nearby transmitter. This corresponds to the ALOHA-type
multiple access scheme [2], which however has become
obsolete [8].

The more advanced multiple access strategies, e.g. CSMA
and CSMA/CD (Carrier Sense Multiple Access with Collision
Detection) [9], have become prevailing with widespread
adoption. The general idea of CSMA schemes is that nearby
nodes will not be scheduled to transmit simultaneously, i.e.,
a minimum separation distance is imposed among concurrent
transmitters. Therefore, it is natural to expect that CSMA could
improve on the performance of ALOHA schemes by alleviating
interference, particularly under heavy traffic. On the other
hand, this distance constraint leads to a spatial correlation
problem, which means that the location of a transmitter is
dependent on the location of other concurrent transmitters.
Therefore, even if all nodes are initially distributed following a
Poisson point process (PPP), the set of concurrent transmitters
cannot be obtained by independent thinning of the PPP.
Thus, the set of concurrent transmitters no longer forms a
PPP but a more complicated point process. Matérn hard-core
point processes are widely used to model the set of concurrent
transmitters [10]–[12]. However, the distribution of such a hard-core
process is difficult to analyze and a closed-form expression
is yet to be obtained [10]–[14]. In this paper, we use an
entirely different approach. In particular, by investigating
bounds on the interference, instead of an accurate characterization
of the interference distribution, we are able to avoid the
above-mentioned difficulty in finding the accurate distribution of
concurrent transmitters and the associated interference.


In this paper, we analyze the connectivity of wireless CSMA
networks under the SINR model. Specifically, we consider
a network with n nodes uniformly i.i.d. on a square
$[-\sqrt{n}/2, \sqrt{n}/2]^2$, and each node is capable of performing
carrier-sensing operation. A pair of nodes are directly connected iff
both (1) and (2) are satisfied. Further, the attenuation function
assumes a power-law form, the same model considered in [5],
[6]. The contributions of this paper are:


1) We show that the interference experienced by any receiver
in the network is upper bounded. Based on this
result, we further show that for an arbitrarily chosen
SINR threshold, there exists a transmission range $R_0$
such that a pair of nodes are directly connected if
their Euclidean distance is smaller than or equal to $R_0$.
On that basis, we derive a sufficient condition, i.e. an
upper bound on the transmission power, for the CSMA
network to be a.a.s. connected under the SINR model
as $n \to \infty$. An event $\xi_n$ depending on n is said to occur
a.a.s. if its probability tends to 1 as $n \to \infty$.

2) We provide a necessary condition, i.e. a lower bound on
the transmission power, for the CSMA network to be
a.a.s. connected. The lower bound is a tight bound and
differs from the upper bound by a constant factor only.

3) We show that the transmission power only needs to be
increased by a constant factor to combat interference and
maintain connectivity compared with that considering a
UDM without interference. This result is in stark contrast
with previous results considering the connectivity
of ALOHA networks [5], [6] under the SINR model,
which show that connectivity is much harder to achieve
in the presence of interference and is impossible in a
narrow-band system where $\gamma = 1$.

The  remainder  of  this  paper  is  organized  as  follows:  Section 
II  reviews  related  work;  Section  III  defines  network  and 
connection  models.  In  Section  IV  we  first  derive  an  upper 
bound  on  the  interference  in  CSMA  networks.  Based  on 
the  upper  bound,  a  sufficient  condition  for  connectivity  is 
obtained;  Section  V  investigates  a  necessary  condition  for  a 
connected  CSMA  network;  finally  Section  VI  concludes  the 
paper  and  discusses  future  work. 

II.  Related  Work 

The literature is rich in studying connectivity using the well-known
random geometric graph and the UDM, which is
usually obtained by randomly and uniformly distributing n
nodes in a given area and connecting any two nodes iff
their Euclidean distance is smaller than or equal to a certain
threshold r(n), known as the transmission range. This model
corresponds to a special case of the SINR model in (1), i.e.
when $\gamma = 0$ (perfect orthogonality, no interference). Significant
outcomes have been achieved for both asymptotically
infinite n [1], [15] and for finite n [16]–[18]. Particularly,
Penrose [15] and Gupta and Kumar [1] prove that under the
UDM and in a disk of unit area, the above network with a
transmission range of $r(n) = \sqrt{\frac{\log n + c(n)}{\pi n}}$ is a.a.s. connected
as $n \to \infty$ iff $c(n) \to \infty$. Most of the results for finite n are
empirical results.
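
As a small numerical illustration of the unit-area result quoted above, the sketch below evaluates $r(n) = \sqrt{(\log n + c(n))/(\pi n)}$. The function names and the choice $c(n) = \log\log n$ are assumptions made only for this example; any $c(n) \to \infty$ would serve.

```python
import math

def udm_critical_range(n, c_n):
    """Transmission range r(n) for n nodes in a unit-area region under the UDM."""
    return math.sqrt((math.log(n) + c_n) / (math.pi * n))

def expected_neighbours(n, c_n):
    """n * pi * r(n)^2, i.e. the mean node count in a disk of radius r(n), boundary ignored."""
    return n * math.pi * udm_critical_range(n, c_n) ** 2

for n in (10**3, 10**6):
    c_n = math.log(math.log(n))  # assumed slowly growing c(n)
    print(n, udm_critical_range(n, c_n), expected_neighbours(n, c_n))
```

The range shrinks with n while the expected number of neighbours grows like $\log n + c(n)$, which is the usual reading of the connectivity threshold.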

The works [3], [19]–[21] investigate the necessary condition
for  the  above  network  to  be  a.a.s.  connected  under  the  more 
realistic  log-normal  connection  model,  where  two  nodes  are 
directly  connected  if  the  received  power  at  one  node  from  the 
other  node,  whose  attenuation  follows  the  log-normal  model 
[9],  is  greater  than  a  given  threshold.  These  results  however 
rely  on  the  assumption  that  the  node  isolation  events  are 
independent,  which  is  yet  to  be  proved. 

Despite the significant impact of interference caused by concurrent
transmissions on connectivity, limited work exists on
analyzing  connectivity  under  the  SINR  model.  In  [22],  [23], 
the  authors  study  connectivity  from  the  perspective  of  channel 
assignment.  Specifically,  channel/time  slots  are  assigned  to 
each  link  for  all  active  links  to  be  simultaneously  transmitting 
while  satisfying  the  SINR  requirement.  The  two  papers  [5], 
[6]  discussed  in  Section  I  study  the  impact  of  interference 
on  the  connectivity  of  ALOHA  networks  from  the  percolation 
perspective. 

Mao  et  al.  [24],  [25]  study  the  connectivity  problem  under  a 
generic  random  connection  model,  viz.  two  nodes  separated  by 
a  Euclidean  distance  x  are  directly  connected  with  probability 
$g(x)$, where $g : [0, \infty) \to [0, 1]$ satisfies the properties of
integral  boundedness,  rotational  invariance  and  non-increasing 
monotonicity  [7],  independent  of  the  event  that  another  pair 
of  nodes  are  directly  connected.  The  authors  establish  the 
requirements for an a.a.s. connected network.

A major difficulty in moving to the SINR model is that
under  the  unit  disk  model  or  the  random  connection  model, 
connections  are  assumed  to  be  independent,  i.e.  the  event  that  a 
pair  of  nodes  are  directly  connected  and  the  event  that  another 
distinct  pair  of  nodes  are  directly  connected  are  independent. 
This  independence  assumption  on  connections  is  critical  in 
the  analysis  of  connectivity  under  these  two  models.  In  the 
SINR  model  however,  due  to  the  presence  of  interference, 
the  existence  of  a  direct  connection  between  a  pair  of  nodes 
depends  on  both  the  location  and  the  activities  of  other  nodes 
in  the  network. 

Some other work exists on modeling the point process formed
by concurrent transmitters. Matérn hard-core point processes
are widely used to model the set of concurrent transmitters in
a CSMA network [10]–[13]. However, such hard-core processes
are difficult to analyze. Consequently, some work [12], [26],
[27] uses a PPP to approximate the distribution of concurrent
transmitters. A recent work [14] compared the mean interference
in a hard-core process and in the corresponding PPP
approximation.


III.  Network  Models 


We consider a network with n nodes uniformly and i.i.d. distributed on a
square $[-\sqrt{n}/2, \sqrt{n}/2]^2$, i.e. the so-called extended network model
[2], where the network size scales with the network area while
the node density is fixed. All nodes use the same transmission
power P and there is always a packet at a node waiting to
be transmitted. The latter assumption allows us to focus on the
network property without being disturbed by other factors, e.g.
traffic distribution.


A.  Attenuation  and  interference 

We consider that the attenuation function $\ell$ in (1) and (2) only
depends on Euclidean distance and is a power-law function
[5], [6]: $\ell(s) = s^{-\alpha}$, where s represents the Euclidean distance
between a pair of nodes and $\alpha$ is the path-loss exponent,
which typically varies from 2 to 6 [9, p. 139]. In this paper
we assume $\alpha > 2$. Note that when $\alpha \le 2$, the interference
experienced by a receiver in the CSMA network investigated
cannot be bounded by a constant. The above assumptions on $\ell$
are widely used [2], [5], [12] and supported by measurement
studies [9]. As commonly done in connectivity analysis
[1], [5]–[7], [15], the impact of small-scale fading is ignored
and only bidirectional communication links are considered.
Further, since in dense sensor networks and cellular networks
the background noise is typically negligibly small [2], [12], we
ignore the background noise $N_0$ in (1) and (2). In addition,
we consider that all nodes use the same channel, i.e. $\gamma = 1$,
which corresponds to a narrow-band system [2], [5].


B. Carrier-sensing

In CSMA networks, two nodes located at $x_i$ and $x_j$ respectively
can transmit simultaneously iff they cannot detect each
other's transmission, i.e. both $P\ell(x_i, x_j)$ and $P\ell(x_j, x_i)$
in (1) and (2) are below a certain detection threshold $P_{th}$.
From the power-law path loss, the carrier-sensing range $R_c$,
which determines the minimum Euclidean distance between
two concurrent transmitters, is given by

$R_c = (P/P_{th})^{1/\alpha}$  (3)

One may alternatively consider a scenario where a node
transmits when the aggregated interference is below $P_{th}$, which
forms a trivial extension of the scenario considered in this
paper.

IV. A Sufficient Condition for Asymptotically
Almost Surely Connectivity

A major challenge in connectivity analysis under the SINR
model is that the existence of a direct connection between a
pair of nodes depends on both the locations and activities of
other nodes in the network, i.e. connections are correlated.
In this paper, we resort to a coupling approach to handle the
connection correlations. The main idea of the coupling technique
is to build the connection between a more complicated model
and a simpler model with established results such that if a
property, e.g. connectivity, is true in the simpler model, it will
also be true in the more complicated model. Therefore the
property of the more complicated model can be studied by
studying the simpler counterpart.

Specifically, we first establish an upper bound on the interference
experienced by any receiver in CSMA networks. On that
basis, we show that for an arbitrarily chosen SINR threshold,
there exists a transmission range $R_0$ such that a pair of nodes
are directly connected if their Euclidean distance is smaller
than or equal to $R_0$. Then we can use existing results on
connectivity under the UDM to analyze connectivity under
the SINR model.


A. An upper bound on interference and the associated transmission range

The following theorem provides an upper bound on the interference.


Theorem 1. Consider a CSMA network with nodes distributed
arbitrarily on a finite area in $\mathbb{R}^2$. Denote by $r_0$ the Euclidean
distance between a receiver and its nearest transmitter in the
network, which is also the intended transmitter for the receiver.
When $r_0 < R_c$, the maximum interference experienced by
the receiver is smaller than or equal to $N(r_0) = N_1(r_0) + N_2$,
where

$$N_1(r_0) = \frac{4P\left(\frac{5\sqrt{3}}{4}R_c - r_0\right)^{1-\alpha}\left(\frac{\sqrt{3}(3\alpha-1)}{4}R_c - r_0\right)}{R_c^2(\alpha-1)(\alpha-2)} + \frac{3P}{(R_c - r_0)^{\alpha}} + \frac{3P}{(\sqrt{3}R_c - r_0)^{\alpha}} + \frac{3P\left(\frac{3}{2}R_c - r_0\right)^{1-\alpha}}{(\alpha-1)R_c} \quad (4)$$

$$N_2 = \frac{3P}{R_c^{\alpha}} + \frac{3P\left(\frac{3}{2}\right)^{1-\alpha}}{(\alpha-1)R_c^{\alpha}} + \frac{3P}{(\sqrt{3}R_c)^{\alpha}} + \frac{3P\left(\frac{5}{4}\right)^{1-\alpha}(3\alpha-1)}{(\alpha-1)(\alpha-2)(\sqrt{3}R_c)^{\alpha}} \quad (5)$$

Proof: See Appendix I. ■

Remark 2. The upper bound in Theorem 1 is valid for any
node distribution. For a sparse network or a network where
nodes are placed in a coordinated or planned manner, Theorem 1
can be extended to be applicable by replacing $R_c$ with the
minimum distance among concurrent transmitters.

Remark 3. The assumption that $r_0 < R_c$ is valid in most
wireless systems, which not only require the SINR to be above
a threshold but also require the received signal to be of sufficiently
good quality. However, Theorem 1 does not critically
depend on the assumption. For $r_0 \ge R_c$, so long as there exists
a positive integer c such that $r_0 < cR_c$, the upper bound can be
revised to accommodate the situation by changing the range
of the summation in (20) (in Appendix I) from $[3, \infty]$ and
$[2, \infty]$ to $[c+2, \infty]$ and $[c+1, \infty]$ respectively and revising
the results accordingly.

The  following  result  can  be  obtained  as  a  ready  consequence 
of  Theorem  1 . 

Corollary 4. Under the same settings as in Theorem 1, assume
that the SINR threshold in (1) and (2) is $\beta$. There exists a
transmission range $R_0 < R_c$ such that a pair of nodes are
directly connected if their Euclidean distance is smaller than
or equal to $R_0$, given implicitly by

$P R_0^{-\alpha} / N(R_0) = \beta$  (6)


Proof: Theorem 1 establishes that the interference experienced
by a receiver z at $r_0$ from its transmitter w,
denoted by $I(r_0)$, is upper bounded by $N(r_0)$. Note that,
for $r_0 < R_c$, $N(r_0)$ is increasing with $r_0$ and $P r_0^{-\alpha}$ is
decreasing with $r_0$. Therefore, using (6) the SINR of a receiver
at $r_0 \le R_0$ from its transmitter, denoted by $\mathrm{SINR}(r_0)$, satisfies

$$\mathrm{SINR}(r_0) = \frac{P r_0^{-\alpha}}{I(r_0)} \ge \frac{P r_0^{-\alpha}}{N(r_0)} \ge \frac{P R_0^{-\alpha}}{N(R_0)} = \beta$$

By symmetry, when the transmission occurs in the opposite
direction, i.e. from z to w, the interference generated by the
set of nodes that are transmitting at the same time as z is also
upper bounded by $N(r_0)$. Therefore the SINR at w is also
greater than or equal to $\beta$.

Finally the existence of a (unique) solution to (6) can be
proved by noting that $\frac{P r_0^{-\alpha}}{N(r_0)} \to \infty$ as $r_0 \to 0$,
$\frac{P r_0^{-\alpha}}{N(r_0)} \to 0$ as $r_0 \to R_c^-$ and that $\frac{P r_0^{-\alpha}}{N(r_0)}$ is
monotonically decreasing with $r_0$. ■


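Corollary 4 relates $R_0$ to P and allows the computation of $R_0$
given P and the converse. A more convenient way to study
the relation between P and $R_0$ is by noting that $P = P_{th} R_c^{\alpha}$
and considering $R_0$ as a function of $R_c$. Using (4), (5) and
letting $R_c/R_0 = x$, (6) can be rewritten as

$$\frac{1}{\beta} = \frac{4\left(\frac{5\sqrt{3}}{4}x - 1\right)^{1-\alpha}\left(\frac{\sqrt{3}(3\alpha-1)}{4}x - 1\right)}{x^2(\alpha-1)(\alpha-2)} + \frac{3}{(x-1)^{\alpha}} + \frac{3}{(\sqrt{3}x - 1)^{\alpha}} + \frac{3\left(\frac{3}{2}x - 1\right)^{1-\alpha}}{(\alpha-1)x} + \frac{3}{x^{\alpha}} + \frac{3\left(\frac{3}{2}\right)^{1-\alpha}}{x^{\alpha}(\alpha-1)} + \frac{3}{(\sqrt{3}x)^{\alpha}} + \frac{3\left(\frac{5}{4}\right)^{1-\alpha}(3\alpha-1)}{(\alpha-1)(\alpha-2)(\sqrt{3}x)^{\alpha}} \quad (7)$$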


Figure 1. Variation of the ratio $R_c/R_0$ with the SINR requirement $\beta$ when the
path loss exponent $\alpha$ equals 2.5, 3, 4, respectively.


Figure 1 shows the ratio $R_c/R_0$ as a function of $\beta$. Different
curves represent different choices of the path loss exponent $\alpha$.
For instance, when $\beta = 10$ and $\alpha = 4$, we have $R_c/R_0 = 3.6$.
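
The ratio plotted in Figure 1 can be reproduced numerically. The sketch below is a hedged illustration: it evaluates the bound $N(r_0)$ of Theorem 1 and solves (6) for $R_0$ by bisection with $R_c = P = 1$, relying on the monotonicity argued in the proof of Corollary 4. The function names, the bisection bracket and the iteration count are assumptions of this example, not part of the paper.

```python
import math

def interference_bound(P, Rc, r0, alpha):
    """Upper bound N(r0) = N1(r0) + N2 of Theorem 1 (requires r0 < Rc, alpha > 2)."""
    s3 = math.sqrt(3.0)
    a1, a2 = alpha - 1.0, alpha - 2.0
    n1 = (4.0 * P * (5.0 * s3 / 4.0 * Rc - r0) ** (1.0 - alpha)
          * (s3 * (3.0 * alpha - 1.0) / 4.0 * Rc - r0) / (Rc ** 2 * a1 * a2)
          + 3.0 * P / (Rc - r0) ** alpha
          + 3.0 * P / (s3 * Rc - r0) ** alpha
          + 3.0 * P * (1.5 * Rc - r0) ** (1.0 - alpha) / (a1 * Rc))
    n2 = (3.0 * P / Rc ** alpha
          + 3.0 * P * 1.5 ** (1.0 - alpha) / (a1 * Rc ** alpha)
          + 3.0 * P / (s3 * Rc) ** alpha
          + 3.0 * P * 1.25 ** (1.0 - alpha) * (3.0 * alpha - 1.0)
          / (a1 * a2 * (s3 * Rc) ** alpha))
    return n1 + n2

def sinr_lower_bound(P, Rc, r0, alpha):
    """P * r0^-alpha / N(r0): a lower bound on the SINR at distance r0."""
    return P * r0 ** (-alpha) / interference_bound(P, Rc, r0, alpha)

def solve_range_ratio(beta, alpha, iterations=200):
    """Return Rc/R0, where R0 solves (6), by bisection on (0, Rc) with Rc = P = 1."""
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sinr_lower_bound(1.0, 1.0, mid, alpha) > beta:
            lo = mid  # bound still above the threshold, so R0 is larger
        else:
            hi = mid
    return 1.0 / (0.5 * (lo + hi))

ratio = solve_range_ratio(beta=10.0, alpha=4.0)
print(round(ratio, 2))  # close to the 3.6 quoted for Figure 1
```

Because the ratio depends only on $\beta$ and $\alpha$, fixing $R_c = P = 1$ loses no generality; the same routine can be swept over $\beta$ to trace the other curves.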


B.  A  sufficient  condition  for  connectivity 

Based on the transmission range $R_0$ derived in Corollary 4,
we obtain another main result:


Theorem 5. Consider a CSMA network with a total of n
nodes i.i.d. on a square $[-\sqrt{n}/2, \sqrt{n}/2]^2$ following a uniform
distribution. A pair of nodes are directly connected iff both
(1) and (2) ($\gamma = 1$ and $N_0 = 0$ in (1) and (2)) are satisfied.
As $n \to \infty$, the above network is a.a.s. connected if the
transmission power

$P = P_{th} b_1^{\alpha} (\log n + c(n))^{\alpha/2}$,  (8)

where $b_1 = b'/\sqrt{\pi}$, $c(n) = o(\log n)$ and $c(n) \to \infty$ as $n \to \infty$,
and $\infty > b' > 1$ is the solution to (7). (By $f(x) = o(g(x))$,
we mean that $\lim_{x \to \infty} f(x)/g(x) = 0$.)


Proof: The results in [1], [15] show that, for a network
with a total of n nodes uniformly i.i.d. on a $\sqrt{n} \times \sqrt{n}$ square
in which a pair of nodes are directly connected iff their Euclidean
distance is smaller than or equal to a given threshold r(n)
(i.e., UDM), the network is a.a.s. connected as $n \to \infty$ iff
$r(n) = \sqrt{\frac{\log n + c(n)}{\pi}}$ where $c(n) \to \infty$ as $n \to \infty$. Using this
result, (7) (letting $b' = R_c/R_0$), Corollary 4 and Theorem 1, the
result in the theorem follows. ■
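
The scaling in (8) can be checked against its derivation: setting $R_0 = r(n)$ and $R_c = b' R_0$ in $P = P_{th} R_c^{\alpha}$ should give the same number. The sketch below does this under stated assumptions: the function name is hypothetical, $b' = 3.6$ is the value quoted for $\beta = 10$, $\alpha = 4$, and $c(n) = \log\log n$ is just one admissible choice.

```python
import math

def sufficient_power(n, p_th, b_prime, alpha, c_n):
    """Transmission power of (8): P_th * b1^alpha * (log n + c(n))^(alpha/2)."""
    b1 = b_prime / math.sqrt(math.pi)
    return p_th * b1 ** alpha * (math.log(n) + c_n) ** (alpha / 2.0)

n = 10**4
alpha, b_prime, p_th = 4.0, 3.6, 1.0
c_n = math.log(math.log(n))  # assumed c(n): tends to infinity and is o(log n)

power = sufficient_power(n, p_th, b_prime, alpha, c_n)

# Same quantity via the derivation: R0 = r(n), Rc = b' * R0, P = P_th * Rc^alpha.
r0 = math.sqrt((math.log(n) + c_n) / math.pi)
power_from_range = p_th * (b_prime * r0) ** alpha
print(power, power_from_range)
```

Relative to the interference-free UDM, where $R_c$ would only need to equal $r(n)$, the power is multiplied by $b'^{\alpha}$, a constant that does not grow with n.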

The implication of Theorem 5 is that in CSMA networks, since
the interference is bounded above by a constant almost surely
as shown in Theorem 1, to meet an arbitrarily high (albeit
constant with the increase in n) $\beta$, the power only needs to
be increased by a constant factor compared with that in the
unit disk model to maintain the same set of connections. This
result is in contrast to the ALOHA networks considered in [5],
[6] in which percolation only occurs for a sufficiently small
$\gamma$.


V.  A  Necessary  Condition  for  Asymptotically 
Almost  Surely  Connectivity 

Section IV derives a sufficient condition for a connected
CSMA network as $n \to \infty$ in the presence of interference.
A logical question arises: what is the necessary condition for
the same CSMA network to be connected as $n \to \infty$?

In a CSMA network, any set of nodes can transmit simultaneously
as long as the carrier-sensing constraints are
satisfied. Further, in a large-scale network, scheduling is
often performed in a distributed manner. In the absence of
accurate global knowledge on which particular set of nodes
are simultaneously transmitting at a particular time instant, it
is natural that a node sets its transmission power to be above
the minimum transmission power required for a network to
be connected under any scheduling algorithm (It is trivial
to show that, see also the proof of Lemma 6, when the
transmission power increases, connectivity will also improve).
Denote that minimum power by $P_\Omega$, where $\Omega$ represents
the set of all scheduling algorithms satisfying the carrier-sensing
constraints. In this section, we investigate $P_\Omega$, i.e.
a necessary condition required for connectivity as $n \to \infty$.
This is done by analyzing the transmission power required
for the above network to have no isolated node, which is
a necessary condition for having a connected network. The
following lemma is required for the analysis of $P_\Omega$:

Lemma 6. Denote by $P'_\Omega$ (respectively, $P_\omega$) the minimum
transmission power required for the network to have no
isolated node under any scheduling (respectively, under a
particular scheduling $\omega$). We have $P_\Omega \ge P'_\Omega = \max_{\omega \in \Omega} P_\omega$.

Proof: We prove the lemma by showing that the minimum
transmission power required for the network to have no
isolated node under any scheduling has to be greater than or
equal to the minimum transmission power required for the
same network to have no isolated node under a particular
scheduling.

Define a set of nodes that can simultaneously transmit while
satisfying the carrier-sensing constraints as an independent set.
Obviously the independent set depends on the transmission
power of nodes. As the transmission power decreases, other
things being equal, $R_c$ will decrease and the number of nodes
that can simultaneously transmit will increase or remain the
same.

Denote by $\Phi$ a set of nodes that are scheduled to transmit
simultaneously in the CSMA network. It follows that $\Phi$ must
be an independent set. Given $\Phi$, a node $v \in \Phi$ is isolated
if there is no node in the network that can successfully
receive from it when the nodes in $\Phi$ are simultaneously
transmitting. Further, as explained in the last paragraph, the
independent set depends on the transmission power. When
the transmission power is decreased from $P_1$ to $P_2$, where
$P_2 < P_1$, if $\Phi$ is an independent set at power level $P_1$, it
will also be an independent set at power level $P_2$. Based on
the above observation and the fact that, by (1) and (2), a decrease in
the transmission power will cause a decrease in the SINR, it
readily follows that if a node $v \in \Phi$ is isolated at power level
$P_1$ when the set of active transmitters is $\Phi$, it will also be
isolated at power level $P_2$ when the set of active transmitters
is $\Phi$. For any transmission power less than $\max_{\omega \in \Omega} P_\omega$,
there exists a scheduling that will result in the network having an
isolated node at that power level. Therefore, $P'_\Omega$ has to satisfy
$P'_\Omega = \max_{\omega \in \Omega} P_\omega$. ■


Figure 2. An illustration of the hexagonal partition of the network area. The
shaded hexagons represent simultaneously active hexagons, where k = 3.

Remark 7. As an easy consequence of Lemma 6, the probability
that a CSMA network has no isolated node is a non-decreasing
function of the transmission power.
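
The monotonicity used in the proof of Lemma 6 (a set that is independent at power $P_1$ stays independent at any lower power $P_2$) can be illustrated with a small sketch. The helper names and the three-node layout are hypothetical; the sensing rule is the one from Section III, i.e. two nodes may transmit together iff the power each senses from the other is below $P_{th}$.

```python
import math
from itertools import combinations

def can_transmit_together(a, b, power, p_th, alpha):
    """True iff the power sensed between positions a and b is below the threshold p_th."""
    return power * math.dist(a, b) ** (-alpha) < p_th

def is_independent_set(nodes, power, p_th, alpha):
    """A set is independent iff every pair satisfies the carrier-sensing constraint."""
    return all(can_transmit_together(a, b, power, p_th, alpha)
               for a, b in combinations(nodes, 2))

# Hypothetical transmitters with pairwise distances 3, 4 and 5.
nodes = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
high = is_independent_set(nodes, power=256.0, p_th=1.0, alpha=4.0)  # Rc = 4
low = is_independent_set(nodes, power=16.0, p_th=1.0, alpha=4.0)    # Rc = 2
print(high, low)  # prints "False True"
```

Lowering the power shrinks $R_c$, so more nodes are allowed to transmit at once; no set that was admissible at the higher power is ever lost.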

Now the task becomes constructing a particular scheduling
which gives as large $P_\omega$ as possible, i.e. a tight lower bound
on $P_\Omega$. Next we construct such a scheduling $\omega$ heuristically.


A. Construction of scheduling algorithm $\omega$


Obviously, $\omega$ needs to satisfy the constraint on the minimum
separation distance between concurrent transmitters imposed
by the carrier-sensing requirement. Meanwhile, $\omega$ needs to
schedule as many concurrent transmissions as possible to
maximize interference, hence $P_\omega$.


We start with a lemma that is required for the construction of
$\omega$.

Lemma 8. Partition the square $[-\sqrt{n}/2, \sqrt{n}/2]^2$ into
non-overlapping hexagons of equal side length $s_n$ such that the
origin o coincides with the centre of a hexagon and two
diagonal vertices of this hexagon, whose Euclidean distance
is $2s_n$, are located on the y axis, as shown in Figure 2. We call a
hexagon an interior hexagon if it is entirely contained in the
square $[-\sqrt{n}/2, \sqrt{n}/2]^2$. When $s_n = \sqrt{(2 \log n)/5}$, a.a.s. each
interior hexagon is occupied by at least one node as $n \to \infty$.


Proof: Because nodes are uniformly i.i.d., the probability
that an arbitrary interior hexagon is empty is $\left(1 - \frac{3\sqrt{3}s_n^2}{2n}\right)^n$.
Let $\xi_i$ be the event that an interior hexagon i is empty, where
$i \in \Xi$, and $\Xi$ denotes the set of indices of all interior hexagons.
There are at most $\frac{2n}{3\sqrt{3}s_n^2}$ interior hexagons.
Denote by $A_n$ the event that there is at least one empty
interior hexagon in $[-\sqrt{n}/2, \sqrt{n}/2]^2$. It follows that
$\Pr(A_n) = \Pr(\cup_{i \in \Xi}\, \xi_i)$. Using the union bound, we have

$$\Pr(\cup_{i \in \Xi}\, \xi_i) \le \sum_{i \in \Xi} \Pr(\xi_i) \le \frac{2n}{3\sqrt{3}s_n^2}\left(1 - \frac{3\sqrt{3}s_n^2}{2n}\right)^n$$

Using the fact that $1 - x \le \exp(-x)$ and $s_n = \sqrt{(2 \log n)/5}$, we have

$$\lim_{n \to \infty} \Pr(A_n) \le \lim_{n \to \infty} \frac{2n}{3\sqrt{3}s_n^2} \exp\left(-\frac{3\sqrt{3}s_n^2}{2}\right) = \lim_{n \to \infty} \frac{5n}{3\sqrt{3}\, n^{3\sqrt{3}/5} \log n} = 0$$

which completes the proof. ■
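
The union bound in the proof of Lemma 8 decays slowly, because the exponent $3\sqrt{3}/5 \approx 1.04$ only just exceeds 1. A minimal sketch, with hypothetical function names, that evaluates the bound for a few network sizes:

```python
import math

def hexagon_side(n):
    """Side length s_n = sqrt(2 log n / 5) used in Lemma 8."""
    return math.sqrt(2.0 * math.log(n) / 5.0)

def empty_hexagon_union_bound(n):
    """(number of interior hexagons) * Pr(one given hexagon is empty)."""
    area = 3.0 * math.sqrt(3.0) / 2.0 * hexagon_side(n) ** 2  # area of one hexagon
    max_hexagons = n / area           # at most 2n / (3 sqrt(3) s_n^2) interior hexagons
    prob_empty = (1.0 - area / n) ** n
    return max_hexagons * prob_empty

for n in (10**3, 10**6, 10**9):
    print(n, empty_hexagon_union_bound(n))
```

The values shrink towards zero as n grows, but only at a rate of roughly $n^{-0.04}/\log n$, so the "a.a.s." statement is genuinely asymptotic.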


Hereinafter,  we  declare  a  hexagon  to  be  active  if  there  is  a 
node  transmitting  in  it.  We  consider  a  scheduling  u>  that  uses 
the  hexagons  as  the  basic  unit  for  scheduling.  Due  to  the  min¬ 
imum  separation  distance  constraint,  any  two  simultaneously 
active  hexagons  should  be  separated  by  a  minimum  Euclidean 
distance  (depending  on  the  carrier-sensing  range  given  in  (3)). 
Let  k  be  an  integer  and  represent  the  minimum  number  of 
inactive  hexagons  between  two  closest  simultaneously  active 
hexagons  (see  Figure  2).  Any  two  nodes  inside  the  two  active 
hexagons  are  separated  by  a  Euclidean  distance  of  at  least 
\/3 ksn.  With  a  bit  twist  of  terminology,  we  further  define 
a  maximal  independent  set  for  scheduling  to  be  the  set  of 
hexagons  that  a)  includes  as  many  hexagons  as  possible;  and  b) 
closest  hexagons  in  the  set  are  separated  by  exactly  k  adjacent 
hexagons.  Figure  2  illustrates  such  a  maximal  independent  set 
with  k  =  3. 


We define $\omega$ such that only hexagons belonging to the same
maximal independent set can be active at the same time.
No two nodes in the same hexagon can be scheduled to transmit
simultaneously. (Note that if a hexagon intersecting the border
of $\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$ has node(s) in it, it is also included in the
maximal independent set and its node(s) are treated in the same
way as other nodes in interior hexagons.) As a consequence
of the CSMA constraint and the definition of $k$,

$\sqrt{3}ks_n \ge R_c > \sqrt{3}(k - 1)s_n$ (9)
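As a small illustration of (9), the integer $k$ can be computed from the carrier-sensing range and the hexagon side length. This is a sketch only; the function name `inactive_hexagon_count` and the numeric values in the usage lines are hypothetical.

```python
import math

def inactive_hexagon_count(rc: float, s_n: float) -> int:
    """Smallest integer k with sqrt(3) * k * s_n >= rc, as required by (9)."""
    if rc <= 0 or s_n <= 0:
        raise ValueError("rc and s_n must be positive")
    return math.ceil(rc / (math.sqrt(3.0) * s_n))

if __name__ == "__main__":
    rc, s_n = 10.0, 1.5                      # illustrative values only
    k = inactive_hexagon_count(rc, s_n)
    print(k, math.sqrt(3.0) * k * s_n, math.sqrt(3.0) * (k - 1) * s_n)
```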


B.  Probability  of  having  no  isolated  node 

In this subsection, we derive a lower bound on the critical transmission
power for the scheduling $\omega$ defined in the previous subsection. This is done by analyzing
the event that the network has no isolated node under $\omega$. The
following theorem summarizes another major outcome of the
paper:

Theorem 9. Under the same setting as in Theorem 5 and the
scheduling algorithm $\omega$, a necessary condition on $P_\omega$ for the
CSMA network to have no isolated node a.a.s. as $n \to \infty$ is

$P_\omega \ge P_{th} b_2^{\alpha} (\log n)^{\frac{\alpha}{2}}$ (10)

where $b_2 = \sqrt{6/5}\,(b - 1)$ and $b$ is the smallest integer
satisfying the inequality:

$\frac{2\left(\sqrt{3}(b+1)+1\right)^{1-\alpha}\left(\sqrt{3}(\alpha-1)(b+1)+1\right)}{(b+1)^2(\alpha-1)(\alpha-2)} \le \frac{1}{\beta}\left(\frac{2\pi}{5}\right)^{\frac{\alpha}{2}}$
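To make the constant in Theorem 9 concrete, the following sketch searches for the smallest integer $b$ satisfying the inequality and evaluates $b_2$. It assumes the inequality has the form $f(b) \le \frac{1}{\beta}(2\pi/5)^{\alpha/2}$ with $f$ as in the theorem; the function names and the search cap `b_max` are hypothetical.

```python
import math

def f(k: int, alpha: float) -> float:
    """Left-hand side of the inequality in Theorem 9, evaluated at b = k."""
    a = math.sqrt(3.0) * (k + 1)
    return (2.0 * (a + 1.0) ** (1.0 - alpha) * (a * (alpha - 1.0) + 1.0)
            / ((k + 1) ** 2 * (alpha - 1.0) * (alpha - 2.0)))

def smallest_b(beta: float, alpha: float, b_max: int = 10**6) -> int:
    """Smallest integer b >= 1 with f(b) <= (2*pi/5)^(alpha/2) / beta."""
    if alpha <= 2:
        raise ValueError("the closed form requires alpha > 2")
    rhs = (2.0 * math.pi / 5.0) ** (alpha / 2.0) / beta
    for b in range(1, b_max + 1):
        if f(b, alpha) <= rhs:
            return b
    raise ValueError("no b found below b_max")

def b2(beta: float, alpha: float) -> float:
    """Constant factor b_2 = sqrt(6/5) * (b - 1) from Theorem 9."""
    return math.sqrt(6.0 / 5.0) * (smallest_b(beta, alpha) - 1)

if __name__ == "__main__":
    for beta in (1.0, 10.0, 100.0, 1000.0):
        print(beta, smallest_b(beta, 4.0), b2(beta, 4.0))
```

Because $b$ is an integer, $b_2$ changes in steps as $\beta$ varies.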

Proof: The main strategy used is to couple the network
under the SINR model with the associated network under
UDM. Then, an upper bound on the probability of having no
isolated node in the network under the SINR model is obtained
by using existing results for UDM.

Denote the Euclidean distance between the centers of two
closest hexagons in a maximal independent set by $L = \sqrt{3}(k + 1)s_n$.
See Figure 2 for an illustration. Divide the
hexagons belonging to the same maximal independent set as
a hexagon $h_i$ into tiers of increasing Euclidean distance from
the centre of $h_i$, using a strategy similar to that in the proof of
Theorem 1. The $m$th tier of $h_i$ has at most $6m$ hexagons.
Further, we declare that the $m$th tier of $h_i$ is complete in
a given area if all the $6m$ hexagons are entirely enclosed
in this given area. Denote by $C_A$ a square $\left[-\frac{\sqrt{cn}}{2}, \frac{\sqrt{cn}}{2}\right]^2$
($0 < c < 1$, and the exact value of $c$ will be decided later
in this paragraph). The hexagon containing the origin $o$ has
a number $t$ of complete tiers in $C_A$. As $c$
increases, $t$ increases as well. For the hexagons located in $C_A$
but near the border of $C_A$, the number of complete tiers in
the square $\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$ decreases with an increase in $c$. We
choose the value of $c$ such that each hexagon inside $C_A$ has at
least $t$ complete tiers in the square $\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$, and the value
of $t$ is maximized. Let $C'_A$ be the union of hexagons entirely
contained in $C_A$. With a slight abuse of terminology, we
use $C_A$ ($C'_A$) to denote both the area itself and the size of the
area. We can obtain

$\lim_{n \to \infty} \frac{C_A}{C'_A} = 1$.


Consider an arbitrary node $i$ transmitting inside a hexagon $h_i$
in $C'_A$. If there is no node that can receive from it, then node
$i$ is isolated. Let $I_{min}$ be the minimum interference that could
possibly be experienced by a potential receiver of node $i$ under
$\omega$. Note that the Euclidean distance between the transmitter
inside a hexagon in the $m$th tier of $h_i$ and the centre of hexagon
$h_i$ is less than $mL + s_n$ (see Figure 2). Using Lemma 12 gives

$I_{min} \ge \sum_{m=1}^{t} 6m\,(mL + s_n)^{-\alpha} P$

$= 6Ps_n^{-\alpha} \sum_{m=1}^{t} m\left(\sqrt{3}m(k+1) + 1\right)^{-\alpha}$

$= 6Ps_n^{-\alpha} \int_{1}^{t+1} \lfloor x \rfloor \left(\sqrt{3}\lfloor x \rfloor (k+1) + 1\right)^{-\alpha} dx$ (11)

$\ge 6Ps_n^{-\alpha} \int_{1}^{t+1} x\left(\sqrt{3}x(k+1) + 1\right)^{-\alpha} dx$ (12)

where $\lfloor x \rfloor$ denotes the largest integer smaller than or equal to
$x$. (12) is obtained due to the fact that $x\left(\sqrt{3}x(k+1) + 1\right)^{-\alpha}$
is a decreasing function when $x \ge \frac{1}{\sqrt{3}(k+1)(\alpha-1)}$ and
$\sqrt{3}(k+1)(\alpha-1) > 1$ for $\alpha > 2$ and $k \ge 1$. Therefore
$x\left(\sqrt{3}x(k+1) + 1\right)^{-\alpha}$ is a decreasing function when $x \ge 1$.

Further, noting that $\lim_{n \to \infty} t = \infty$, it follows that

$\lim_{n \to \infty} 6\int_{1}^{t+1} x\left(\sqrt{3}x(k+1) + 1\right)^{-\alpha} dx = \frac{2\left(\sqrt{3}(k+1)+1\right)^{1-\alpha}\left(\sqrt{3}(\alpha-1)(k+1)+1\right)}{(k+1)^2(\alpha-1)(\alpha-2)} =: f(k)$

The above equation implies that for an arbitrarily small
positive constant $\varepsilon$, there exists a positive integer $n_\varepsilon$ such that
when $n \ge n_\varepsilon$

RHS of (12) $\ge Ps_n^{-\alpha}\left(f(k) - \varepsilon\right) =: J_n$ (13)
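The chain from the tier sum to (12) and the closed form $f(k)$ can be checked numerically. The sketch below assumes the integration limits $1$ and $t+1$ and uses a plain midpoint rule; all function names are hypothetical.

```python
import math

def g(x: float, k: int, alpha: float) -> float:
    """Summand x * (sqrt(3) * x * (k + 1) + 1)^(-alpha)."""
    return x * (math.sqrt(3.0) * x * (k + 1) + 1.0) ** (-alpha)

def tier_sum_bound(power: float, s_n: float, k: int, alpha: float, t: int) -> float:
    """6 P s_n^(-alpha) * sum_{m=1}^{t} g(m), the tier-sum lower bound on I_min."""
    return 6.0 * power * s_n ** (-alpha) * sum(g(m, k, alpha) for m in range(1, t + 1))

def integral_bound(power: float, s_n: float, k: int, alpha: float, t: int,
                   steps: int = 200_000) -> float:
    """6 P s_n^(-alpha) * integral of g over [1, t + 1], midpoint rule."""
    h = t / steps
    total = sum(g(1.0 + (i + 0.5) * h, k, alpha) for i in range(steps))
    return 6.0 * power * s_n ** (-alpha) * total * h

def f_closed(k: int, alpha: float) -> float:
    """Closed form of 6 * integral of g over [1, infinity), valid for alpha > 2."""
    a = math.sqrt(3.0) * (k + 1)
    return (2.0 * (a + 1.0) ** (1.0 - alpha) * (a * (alpha - 1.0) + 1.0)
            / ((k + 1) ** 2 * (alpha - 1.0) * (alpha - 2.0)))

if __name__ == "__main__":
    k, alpha = 2, 4.0
    print(tier_sum_bound(1.0, 1.0, k, alpha, 50))
    print(integral_bound(1.0, 1.0, k, alpha, 50))
    print(f_closed(k, alpha))
```

With $P = s_n = 1$ the sum dominates the integral, and the integral approaches $f(k)$ as $t$ grows.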


Let $d$ be the Euclidean distance between node $i$ and its
receiver. By (1), (2), it follows that only when $\frac{Pd^{-\alpha}}{J_n} \ge \beta$
could the transmission from node $i$ to its receiver possibly
be successful. In other words, if there is no node within a
Euclidean distance of $R = (\beta J_n/P)^{-\frac{1}{\alpha}}$ to node $i$, then it is
isolated.

Denote by $M$ and $M^{SINR}$ the (random) number of isolated
nodes in the CSMA network in the square $\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$
and in $C'_A \subset \left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$ respectively. Denote by $M^{UDM}$ the (random)
number of isolated nodes in an area $C'_A \subset \left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$
in a network with a total of $n$ nodes uniformly i.i.d. on the
square $\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$ under UDM with the transmission range
$R$. Based on the discussion in the last paragraph and using the
coupling technique [7], it can be shown that $\Pr(M \ge 1) \ge \Pr\left(M^{SINR} \ge 1\right) \ge \Pr\left(M^{UDM} \ge 1\right)$. Consequently,

$\Pr(M = 0) \le \Pr\left(M^{UDM} = 0\right)$ (14)


It remains to find the value of $\Pr\left(M^{UDM} = 0\right)$. We first consider
a network with a total of $n$ nodes distributed on a square
$\left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$ under UDM with a transmission range $r(n)$. It
is well known that when the average node degree in the above
network equals $\log n + \zeta(n)$ and $\lim_{n \to \infty} \zeta(n) = \zeta$, where $\zeta$ is
a constant ($\zeta = \infty$ is allowed), the probability that there is no
isolated node in the above network asymptotically converges
to $e^{-e^{-\zeta}}$ as $n \to \infty$ [7], [28], [29]. Further, it was shown in
[30] that the boundary effect has an asymptotically vanishing
impact on the number of isolated nodes. Let $Z$ be a random
integer representing the number of nodes located inside
$C_A \subset \left[-\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}\right]^2$. Then $E(Z) = cn$ and $Var(Z) = cn(1 - c)$.

Let $M^{r(n)}$ be the number of isolated nodes within an area $C_A$
in the above network with a transmission range $r(n)$. Based
on the above results, conditioned on $Z = cn$ we have
(here we have omitted some trivial discussions involving the
situation that $cn$ is not an integer)

$\lim_{n \to \infty} \Pr\left(M^{r(n)} = 0 \mid Z = cn\right) = e^{-ce^{-\zeta}}$ (15)

Using Chebyshev's inequality, for $0 < \delta < \frac{1}{2}$, we obtain that

$\lim_{n \to \infty} \Pr\left(|Z - cn| \ge (cn)^{\frac{1}{2}+\delta}\right) \le \lim_{n \to \infty} \frac{cn(1 - c)}{\left((cn)^{\frac{1}{2}+\delta}\right)^2} = 0$ (16)
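The Chebyshev bound in (16) simplifies to $(1 - c)(cn)^{-2\delta}$, which can be evaluated directly. The sketch below assumes $Var(Z) = cn(1 - c)$ as stated above; the function name `chebyshev_bound` is hypothetical.

```python
def chebyshev_bound(n: float, c: float, delta: float) -> float:
    """Chebyshev upper bound on Pr(|Z - cn| >= (cn)^(1/2 + delta)), capped at 1."""
    if not (0.0 < c < 1.0) or delta <= 0.0:
        raise ValueError("need 0 < c < 1 and delta > 0")
    variance = c * n * (1.0 - c)                 # Var(Z) as stated in the text
    deviation = (c * n) ** (0.5 + delta)
    return min(1.0, variance / deviation ** 2)

if __name__ == "__main__":
    for n in (1e2, 1e4, 1e8):
        print(n, chebyshev_bound(n, c=0.9, delta=0.25))
```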

Let $f(n) = (cn)^{\frac{1}{2}+\delta}$. Using the following two equations:
$\log(n + f(n)) + \zeta(n) = \log n + \log\left(1 + \frac{f(n)}{n}\right) + \zeta(n)$
and $\lim_{n \to \infty}\left(\log\left(1 + \frac{f(n)}{n}\right) + \zeta(n)\right) = \lim_{n \to \infty} \zeta(n) = \zeta$,
and (15), it can be shown that
$\lim_{n \to \infty} \Pr\left(M^{r(n)} = 0 \mid Z = cn + f(n)\right) = e^{-ce^{-\zeta}}$.

Figure 3. A plot of the two constant factors $b_1$ and $b_2$ in the upper bound
(8) and in the lower bound (18) when $\alpha = 4$.

Hence, for any integer $m$ satisfying $-f(n) \le m \le f(n)$,
$\lim_{n \to \infty} \Pr\left(M^{r(n)} = 0 \mid Z = cn + m\right) = e^{-ce^{-\zeta}}$. This
equation, together with (16), allows us to conclude that when
$r(n) = \sqrt{\frac{\log n + \zeta(n)}{\pi}}$,

$\lim_{n \to \infty} \Pr\left(M^{r(n)} = 0\right) = e^{-ce^{-\zeta}}$ (17)
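The limit in (17) reflects the expected number of isolated nodes in $C_A$. Under UDM with unit node density, a node is isolated with probability about $e^{-\pi r(n)^2}$, so the expectation over an area $cn$ is $ce^{-\zeta(n)}$. The sketch below illustrates this heuristic (a Poisson-type approximation, not the rigorous argument of [7]); the function names are hypothetical.

```python
import math

def udm_range(n: float, zeta: float) -> float:
    """Transmission range giving average node degree log(n) + zeta at unit density."""
    return math.sqrt((math.log(n) + zeta) / math.pi)

def expected_isolated(n: float, c: float, zeta: float) -> float:
    """Approximate expected number of isolated nodes in an area of size c * n."""
    r = udm_range(n, zeta)
    return c * n * math.exp(-math.pi * r * r)

def no_isolated_limit(c: float, zeta: float) -> float:
    """Limiting probability exp(-c * exp(-zeta)) appearing in (17)."""
    return math.exp(-c * math.exp(-zeta))

if __name__ == "__main__":
    print(expected_isolated(1e6, 0.9, 1.0), no_isolated_limit(0.9, 1.0))
```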

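As a result of (14), a necessary condition for
$\lim_{n \to \infty} \Pr(M = 0) = 1$ is that $\lim_{n \to \infty} \Pr\left(M^{UDM} = 0\right) = 1$.
Using the fact that $\lim_{n \to \infty} \frac{C_A}{C'_A} = 1$ and (17), it follows that a
necessary condition for the network under the SINR model
to a.a.s. have no isolated node is that $R \ge \sqrt{\frac{\log n + \zeta(n)}{\pi}}$
and $\zeta(n) \to \infty$ as $n \to \infty$. Since $R = (\beta J_n/P)^{-\frac{1}{\alpha}}$,
together with the value of $J_n$ in (13) and the value of $s_n$ in
Lemma 8, we obtain that $f(k) \le \frac{1}{\beta}\left(\frac{2\pi \log n}{5(\log n + \zeta(n))}\right)^{\frac{\alpha}{2}} + \varepsilon$.
Letting $n \to \infty$ and then $\varepsilon \to 0$ in the above inequality yields
$f(k) \le \frac{1}{\beta}\left(\frac{2\pi}{5}\right)^{\frac{\alpha}{2}}$. Based on the above inequality, together
with (3) and (9), Theorem 9 results. ■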

The following corollary is obtained as a ready consequence of
Theorem 9 and Lemma 6.

Corollary 10. A necessary condition required for CSMA
networks to be a.a.s. connected as $n \to \infty$ under any
scheduling algorithm, i.e. a lower bound on $P_0$, is given by

$P_0 \ge P_{th} b_2^{\alpha} (\log n)^{\frac{\alpha}{2}}$ (18)

Comparing the lower bound in (18) with the upper
bound in (8) and noting that $c(n) = o(\log n)$, it can be shown
that, given an arbitrary $\beta$, the two bounds differ by a constant
factor only as $n \to \infty$. Figure 3 shows a plot of the two
constant factors, viz. $b_1$ and $b_2$, in (8) and in (18) respectively,
as a function of $\beta$ when $\alpha = 4$. The curve representing $b_2$ is
a step function due to the granularity caused by the integer $k$
in the scheduling algorithm $\omega$.

VI.  Conclusion  and  Future  Work 

In this paper, we studied the connectivity of wireless CSMA
networks considering the impact of interference. We showed
that, different from an ALOHA network, the aggregate interference
experienced by any receiver in CSMA networks is
upper bounded even when the coefficient $\gamma$ in (1) and (2)
equals 1.

An upper bound and a lower bound were obtained on the
critical transmission power required for having an a.a.s.
connected CSMA network. The two bounds are tight and
differ by a constant factor only. The results suggest that
any pair of nodes can be connected for an arbitrarily high
SINR requirement so long as the carrier-sensing capability
is available. Compared with the case of UDM without
interference, the transmission power only needs to be increased
by a constant factor to combat interference and maintain
connectivity. This is an optimistic result compared with previous
results on the connectivity of ALOHA networks under the
SINR model.

The gap between the two bounds can be further narrowed
by considering more complicated geometric shapes than
hexagons. However, such an improvement is possibly of minor
importance. The implication of the results in this paper is that
there exists a spatial and temporal scheduling algorithm in a
large-scale CSMA network that allows as many concurrent
transmissions as possible and, meanwhile, allows any pair of
nodes in the network to be connected under an arbitrarily
high SINR requirement. We also introduced a hexagon-based
scheduling algorithm that allows the CSMA network to be
connected. However, it remains a major challenge to find the
optimum scheduling algorithm that gives the minimum delay
and the maximum capacity under a specific traffic distribution.


Appendix  I  Proof  of  Theorem  1 

A network on a finite area, denoted by $A \subset \Re^2$, can always be
obtained from a network on an infinite area $\Re^2$ with the same
node density and distribution by removing the nodes outside
$A$. Such a removal process will also remove all transmitters
outside $A$. Therefore the interference at a receiver in $A$ is less
than or equal to the interference experienced by its counterpart
in a network in $\Re^2$. It then suffices to show that the interference
in a network in $\Re^2$ is bounded.

Consider that an arbitrary receiver $z$ is located at a Euclidean
distance $r_0$ from its closest transmitter $w$, which is also the
intended transmitter for $z$. We construct a coordinate system
such that the origin of the coordinate system is at $w$ and $z$ is
on the $+y$ axis, as shown in Fig. 4.

In a CSMA network, the distance between any two concurrent
transmitters is at least $R_c$. Draw a circle of radius $R_c/2$
centered at each transmitter. Then the two circles centered
at two closest transmitters cannot overlap except at a single
point. Therefore the problem of determining the maximum
interference can be transformed into one of determining the
maximum number of equal-radius non-overlapping circles that
can be packed into $\Re^2$. The densest circle packing, i.e. fitting
the maximum number of non-overlapping circles into $\Re^2$, is
obtained by placing the circle centers at the vertices of a
hexagonal lattice [31, p. 8], as shown in Fig. 4.


Group the vertices of the hexagonal lattice into tiers of
increasing distances from the origin. The six vertices of the
first tier are within a Euclidean distance $R_c$ to the origin. The
$6m$ vertices in the $m$th tier are located at distances within
$((m - 1)R_c, mR_c]$ from the origin.

Let $I_1$ be the interference caused by transmitters, hereinafter
referred to as interferers in this section, above the $x$-axis
at node $z$. Using the triangle inequality gives $\|x_i - z\| \ge \|x_i\| - r_0$,
where $x_i$ is the location of an interferer above the
$x$-axis. Among the $6m$ interferers in the $m$th group, half of
them are located above the $x$-axis. Among these interferers
in the $m$th group above the $x$-axis, three of them are at a
Euclidean distance of exactly $mR_c$ from the origin and the
remaining $3(m - 1)$ interferers are at Euclidean distances within
$\left[\frac{\sqrt{3}}{2}mR_c, mR_c\right)$. Hence, we have

$I_1 \le \sum_{m=1}^{\infty}\left(\frac{3(m-1)P}{\left(\frac{\sqrt{3}}{2}mR_c - r_0\right)^{\alpha}} + \frac{3P}{\left(mR_c - r_0\right)^{\alpha}}\right)$ (19)


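Look at the first summation in (19). Let $U_m$, $m = 3, \ldots, \infty$, be
independent random variables, with $U_m$ uniformly distributed in $[m - 1/2, m + 1/2]$.
It follows from the convexity of $\frac{3(m-1)P}{\left(\frac{\sqrt{3}}{2}mR_c - r_0\right)^{\alpha}}$ and Jensen's
inequality (used in the second step) that

$\sum_{m=3}^{\infty} \frac{3(m-1)P}{\left(\frac{\sqrt{3}}{2}mR_c - r_0\right)^{\alpha}} = \sum_{m=3}^{\infty} \frac{3(E(U_m) - 1)P}{\left(\frac{\sqrt{3}}{2}E(U_m)R_c - r_0\right)^{\alpha}}$

$\le \sum_{m=3}^{\infty} E\left(\frac{3(U_m - 1)P}{\left(\frac{\sqrt{3}}{2}U_mR_c - r_0\right)^{\alpha}}\right)$

$= \sum_{m=3}^{\infty} \int_{m-1/2}^{m+1/2} \frac{3(x-1)P}{\left(\frac{\sqrt{3}}{2}xR_c - r_0\right)^{\alpha}} dx$

$= 3P\int_{5/2}^{\infty} (x - 1)\left(\frac{\sqrt{3}}{2}xR_c - r_0\right)^{-\alpha} dx$ (20)

$= \frac{4P\left(\frac{5\sqrt{3}}{4}R_c - r_0\right)^{1-\alpha}\left(\frac{\sqrt{3}}{4}(3\alpha - 1)R_c - r_0\right)}{R_c^2(\alpha - 1)(\alpha - 2)}$ (21)

Likewise, we also have $\sum_{m=2}^{\infty} \frac{3P}{\left(mR_c - r_0\right)^{\alpha}} \le \frac{3P\left(\frac{3}{2}R_c - r_0\right)^{1-\alpha}}{(\alpha - 1)R_c}$.

As a result of the last equation and (19), (21), (4), it follows
that $I_1 \le N_1(r_0)$.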

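Now we consider the total interference caused by interferers
below the $x$-axis at node $z$, denoted by $I_2$. Let $x_i$ be the
location of an interferer below the $x$-axis. It follows from the
triangle inequality that $\|x_i - z\| \ge \|x_i\|$. Therefore

$I_2 \le \sum_{m=1}^{\infty}\left(\frac{3P}{(mR_c)^{\alpha}} + \frac{3(m-1)P}{\left(\frac{\sqrt{3}}{2}mR_c\right)^{\alpha}}\right)$

$\le \frac{3P}{R_c^{\alpha}} + \frac{3P\left(\frac{3}{2}\right)^{1-\alpha}}{(\alpha - 1)R_c^{\alpha}} + \frac{3P}{\left(\sqrt{3}R_c\right)^{\alpha}} + \frac{3P\left(\frac{5}{4}\right)^{1-\alpha}(3\alpha - 1)}{(\alpha - 1)(\alpha - 2)\left(\sqrt{3}R_c\right)^{\alpha}}$ (22)

Combining $I_1 \le N_1(r_0)$ and (22), Theorem 1 is proved.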


Appendix  II  Lemma  12 

Lemma 12 is needed in the proof of Theorem 9. Theorem 11
is used to prove Lemma 12.

Theorem 11. (Theorem 1 in [32]) Let $v_1, v_2, \ldots, v_j$ be
$j$ arbitrary points in $\Re^2$. Let $w_1, w_2, \ldots, w_j$ be $j$ positive
numbers regarded as weights attached to these points, and
define a position vector $c$ by $\sum_{i=1}^{j} w_iv_i = Wc$ where $W = \sum_{i=1}^{j} w_i$.
Then for an arbitrary point $z$, the following holds:

$\sum_{i=1}^{j} w_i\|v_i - z\|^2 = \sum_{i=1}^{j} w_i\|v_i - c\|^2 + W\|z - c\|^2$

Lemma 12. Consider a triangular lattice with unit side length
and having a vertex located at the origin $o$. Define the 1st tier
of points to be the six points placed at the vertices of the
triangular lattice at a distance of 1 to the origin $o$. Let the
$m$th tier of points be the $6m$ points placed at the vertices of the
triangular lattice located at distances within $(m - 1, m]$ from
the origin $o$, as shown in Figure 5. The total number of points
from the 1st tier to the $m$th tier then equals $j = 3m(1 + m)$.
Let $v_1, v_2, \ldots, v_j$ be the location vectors of these $j$ points,
ordered according to their distances to the
origin $o$ in non-decreasing order. For an arbitrary point $z$
located inside the hexagon formed by the 1st tier of six points,
the following holds: $\sum_{i=1}^{j} \|v_i - z\|^{-\alpha}$ is minimized when $z$
is located at the origin $o$.
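The claim of Lemma 12 for the first tier can be checked numerically. The sketch below evaluates $\sum_{i=1}^{6} \|v_i - z\|^{-\alpha}$ on a grid of points $z$ inside a disk of radius 0.8 (which lies inside the hexagon) and reports where the smallest value occurs. This is only an illustrative check for one value of $\alpha$, not a proof; the function names and grid parameters are hypothetical.

```python
import math

def first_tier():
    """Six lattice points at unit distance from the origin."""
    return [(math.cos(i * math.pi / 3), math.sin(i * math.pi / 3)) for i in range(6)]

def inverse_power_sum(z, points, alpha):
    """Sum of ||v - z||^(-alpha) over the given points."""
    return sum(math.dist(z, v) ** (-alpha) for v in points)

def grid_minimum(alpha, radius=0.8, steps=40):
    """Smallest value of the sum, and its location, over a grid inside a disk."""
    points = first_tier()
    best_value, best_z = math.inf, None
    for ix in range(-steps, steps + 1):
        for iy in range(-steps, steps + 1):
            z = (radius * ix / steps, radius * iy / steps)
            if math.hypot(z[0], z[1]) > radius:
                continue                      # keep z strictly inside the hexagon
            value = inverse_power_sum(z, points, alpha)
            if value < best_value:
                best_value, best_z = value, z
    return best_value, best_z

if __name__ == "__main__":
    print(grid_minimum(alpha=4.0))
```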

Proof: Now we use Theorem 11 to prove Lemma 12.
Letting all attached weights $w_i$ equal 1 and using Theorem
11, for an arbitrary point $z$ located inside the hexagon formed
by the 1st tier of six points, we have

$\sum_{i=1}^{6} \|v_i - z\|^2 = \sum_{i=1}^{6} \|v_i - c\|^2 + 6\|z - c\|^2$ (23)

where $c$ is given by $\sum_{i=1}^{6} v_i = 6c$. It is clear that $c$ is the
centroid of the six points. Since the hexagon has a unit side
length, $\|v_i - c\|$ equals 1. Let $x_i = \|v_i - z\|$ and $y = \|z - c\|$.
The problem in Lemma 12 can then be converted to
the following constrained minimization problem:

minimize $f(x_1, \ldots, x_6) = \sum_{i=1}^{6} x_i^{-\alpha}$

subject to $h(x_1, \ldots, x_6) = \sum_{i=1}^{6} x_i^2 - 6 - 6y^2 = 0$

where the constraint is due to (23).

Figure 5. An illustration of a triangular lattice

Using the method of Lagrange multipliers, we first construct the Lagrangian as follows:
$F(x_1, \ldots, x_6, \Lambda) = f(x_1, \ldots, x_6) + \Lambda h(x_1, \ldots, x_6)$,
where the parameter $\Lambda$ is known as the Lagrange multiplier.
Then find the gradient and set it to zero:

$\nabla F(x_1, \ldots, x_6, \Lambda) = \begin{pmatrix} -\alpha x_1^{-\alpha-1} + 2\Lambda x_1 \\ \vdots \\ -\alpha x_6^{-\alpha-1} + 2\Lambda x_6 \\ h(x_1, x_2, \ldots, x_6) \end{pmatrix} = 0$.

Solving the above equation,
it is obtained that $\Lambda = \frac{\alpha}{2}\left(1 + y^2\right)^{-\frac{\alpha+2}{2}}$ and $x_1 = x_2 = \cdots = x_6 = \left(\frac{\alpha}{2\Lambda}\right)^{\frac{1}{\alpha+2}} = \left(1 + y^2\right)^{\frac{1}{2}}$.
Since $x_i = \|v_i - z\|$ denotes
the Euclidean distance from $v_i$ to $z$, only when $z = c$ can we
have $x_1 = x_2 = \cdots = x_6 = 1$. It follows that the minimum
of $f(x_1, x_2, \ldots, x_6)$ is obtained only when $z$ is located at
the origin $o$. Further, for the $6m$ points of the $m$th tier, using
the same method, it can be shown that $\sum_{i=1}^{6m} \|v_i - z\|^{-\alpha}$ is
minimized only when $z$ is located at the origin $o$. The result
follows. ■

Abstract: In this paper, we study the throughput of
interference-limited three dimensional (3D) CSMA networks.
Specifically, we consider a network with a total of $n$ nodes
uniformly i.i.d. in a cube of edge length $n^{1/3}$. Further, a CSMA
random access scheme is employed and the SINR model is used
to model a successful transmission. We first give a sufficient
condition on the transmit power required for the CSMA network
to be asymptotically almost surely (a.a.s.) connected as $n \to \infty$
under the SINR model. Then, we demonstrate constructively that
a throughput of $\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ is obtainable by each node for
an arbitrarily chosen destination.

I.  Introduction 

Wireless multi-hop networks have been increasingly used
in military and civilian applications. In many applications,
the region in which the network is deployed is better modeled
by a 3D space, instead of a two dimensional (2D) planar area.
Examples include a wireless network deployed across different
floors inside a building connecting a variety of devices such
as computers, smart phones, sensors etc., a network formed
by Unmanned Aerial Vehicles and ground devices for reconnaissance
and surveillance, and underwater acoustic sensor
networks. Capacity of such networks is an important problem.
The scaling behavior of capacity when the network becomes
sufficiently large is of particular interest.

Existing work on the capacity of wireless multi-hop networks
has mainly focused on the analysis of 2D networks [1],
[2], including [1] which considered 2D CSMA networks. Limited
work has considered the properties of 3D networks where
centralized/deterministic scheduling schemes like TDMA are
employed [3], [4]. On the other hand, CSMA schemes, which
make use of distributed/randomized medium access protocols,
have become prevalent with widespread adoption. With CSMA,
each node checks the status of the wireless channel before
sending a packet. If the channel is idle (i.e. no carrier is
detected within its carrier-sensing range), then the node starts
its transmission; otherwise, it defers the transmission, usually by a random
amount of time, until the channel becomes idle again. Potential
transmitters in the vicinity of an active transmitter are kept
off. Wireless signals transmitted at the same time mutually
interfere with each other. The SINR (signal to interference plus
noise ratio) model has been widely used to capture the impact
of interference on the quality of a link, and a transmission is
considered to be successful iff a minimum SINR requirement
has been met [2]. Therefore, it is natural to expect that CSMA
could improve the network performance by alleviating the
interference.

In this paper, we consider a CSMA network with $n$ nodes
uniformly i.i.d. in a cube of edge length $n^{1/3}$ and investigate
the throughput of the network. The contributions of this paper
are:

1) We derive an upper bound on the interference experienced
by any receiver in the 3D CSMA network. Using
the result, we show that for an arbitrary SINR requirement,
there exists a transmission range $R_0$ such that
any two nodes are directly connected if their Euclidean
distance is less than or equal to $R_0$. Based on that, we
give a sufficient condition on the transmit power for the
CSMA network to be a.a.s. connected under the SINR
model as $n \to \infty$. A connected network is a prerequisite
for the network to achieve a non-zero throughput.

2) We further show that in the 3D CSMA network, a
throughput of $\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ is achievable. Compared
with the results in [3] and [4], which showed that a
throughput of $\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ is attainable by using
either a deterministic scheduling [3], [4] or without
considering the SINR requirement for successful transmissions
[4], our result shows that a throughput of
$\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ is also attainable even when CSMA is
used and a minimum SINR is required.

The remainder of this paper is organized as follows: Section
II reviews related work; Section III defines the network and
metrics being investigated; in Section IV, we give a sufficient
condition on the transmit power to have an a.a.s. connected
3D CSMA network; in Section V, we obtain a lower bound
on the achievable throughput of 3D CSMA networks; Section
VI concludes the paper and discusses future work.

II.  Related  work 

Existing work on the capacity problem has focused mainly
on 2D networks. The seminal work [2] showed that in a
network with a total of $n$ nodes distributed on a disk of
unit area under the SINR model, the throughput obtainable
by each node is 1) $\Theta\left(\frac{1}{\sqrt{n\log n}}\right)$ if nodes are randomly i.i.d.
and the destination is randomly chosen for each node; 2) $\Theta\left(\frac{1}{\sqrt{n}}\right)$
if nodes' locations, traffic pattern and transmission range are
optimally arranged. Since this pioneering work, extensive
efforts have been made to investigate the capacity in different
scenarios. Significant outcomes have been achieved for both
static networks [5]-[7] and mobile networks [8], [9]. The
paper [9] showed that mobility of nodes can be exploited to
significantly improve network capacity at the expense of delay.
Other work in the area includes [10], [11], which studied the capacity
of networks with infrastructure support; [11] showed that
randomly placed base stations can also boost the throughput.

For  networks  using  distributed/random  CSMA  scheme,  the 
recent  work  [1]  showed  that  a  throughput  of  can 

be  achieved  in  2D  CSMA  networks  with  randomly  chosen 
destinations. The result is of the same order as that of the TDMA
network considered in [2]. The works [12], [13] studied the interactions
between the transmit power, the carrier-sensing range and the
capacity in 2D CSMA networks.

Very limited work (see [3], [4] and references therein)
has studied the capacity of 3D networks, and all of it focused
on networks employing deterministic scheduling. The paper
[3] considered a network deployed in a sphere and showed
that a throughput of $\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ is feasible. A more
recent work [4] studied the capacity of 3D networks under
two scenarios, i.e. nodes are regularly placed and nodes are
Poisson distributed.


III.  Network  models  and  preliminaries 
We consider a network with $n$ nodes uniformly i.i.d. in a
cube with edge length $n^{1/3}$.


A.  Interference  model 

Assume all nodes use a common transmit power $P$. Let
$x_k$, $k \in \Gamma$, be the location of node $k$, where $\Gamma$ represents the
set of indices of all nodes in the network. A transmission from
node $i$ to node $j$ is successful iff the SINR at node $j$ is above
a threshold $\beta$, i.e.

$SINR(x_i, x_j) = \frac{P\ell(x_i, x_j)}{N_0 + \sum_{k \in \Gamma_i} P\ell(x_k, x_j)} \ge \beta$ (1)

where $\Gamma_i \subseteq \Gamma$ denotes the subset of nodes transmitting at the
same time as node $i$. $\ell(x_i, x_j)$ represents the power attenuation
from $x_i$ to $x_j$ and assumes a power-law form, i.e.,

$\ell(x_i, x_j) = \|x_i - x_j\|^{-\alpha}$ (2)

where $\|\cdot\|$ is the Euclidean norm and $\alpha$ is the path-loss exponent.
We assume that the background noise $N_0$ is negligibly
small, i.e. $N_0 = 0$. This assumption is justified because
interference is a major factor that weakens performance in
wireless networks, while the background noise is typically
small and can be combated by increasing the transmit power.
As commonly done in the capacity analysis [2]-[4], [9], [11],
the impact of small-scale fading is ignored.


Since CSMA typically requires an ACK packet to acknowledge
a successful transmission, we explicitly consider bidirectional
links only in the network. In other words, a transmission
from node $i$ to node $j$ is successful iff both $SINR(x_i, x_j)$ and
$SINR(x_j, x_i)$ are above $\beta$. In that case, we also say that nodes
$i$ and $j$ are directly connected.
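The SINR test in (1) and (2) and the bidirectional-link condition can be sketched as follows. This is an illustrative implementation only. For simplicity it uses the same interferer set for both directions of the link, which is an assumption rather than part of the model above, and all function names are hypothetical.

```python
import math
from typing import Iterable, Sequence

Point = Sequence[float]

def attenuation(x: Point, y: Point, alpha: float) -> float:
    """Power-law attenuation ||x - y||^(-alpha), as in (2)."""
    return math.dist(x, y) ** (-alpha)

def sinr(tx: Point, rx: Point, interferers: Iterable[Point],
         power: float, alpha: float, noise: float = 0.0) -> float:
    """SINR at rx for a transmission from tx, as in (1)."""
    signal = power * attenuation(tx, rx, alpha)
    interference = sum(power * attenuation(k, rx, alpha) for k in interferers)
    denominator = noise + interference
    return math.inf if denominator == 0.0 else signal / denominator

def directly_connected(x_i: Point, x_j: Point, interferers: Sequence[Point],
                       power: float, alpha: float, beta: float) -> bool:
    """Bidirectional link test: both directions must meet the threshold beta."""
    forward = sinr(x_i, x_j, interferers, power, alpha)
    backward = sinr(x_j, x_i, interferers, power, alpha)   # same interferers: an assumption
    return forward >= beta and backward >= beta

if __name__ == "__main__":
    others = [(3.0, 0.0, 0.0)]
    print(sinr((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), others, power=1.0, alpha=3.0))
```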

B.  Definition  of  throughput 

The channel rate of a transmission from node $i$ to node $j$
is related to the associated SINR by Shannon's theorem, i.e.,

$R(x_i, x_j) = B\log_2\left(1 + SINR(x_i, x_j)\right)$ (3)

where $B$ is the bandwidth of the channel in Hertz. Due to the
minimum SINR requirement in (1), the channel rate between
a pair of directly connected nodes is at least $B\log_2(1 + \beta)$.
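As a brief illustration of (3), the channel rate and the guaranteed minimum rate of a directly connected pair can be computed as below; the function names and the numeric values are hypothetical.

```python
import math

def channel_rate(bandwidth_hz: float, sinr_value: float) -> float:
    """Shannon rate B * log2(1 + SINR) in bits/sec, as in (3)."""
    if bandwidth_hz < 0 or sinr_value < 0:
        raise ValueError("bandwidth and SINR must be non-negative")
    return bandwidth_hz * math.log2(1.0 + sinr_value)

def minimum_link_rate(bandwidth_hz: float, beta: float) -> float:
    """Lower bound B * log2(1 + beta) on the rate of any directly connected pair."""
    return channel_rate(bandwidth_hz, beta)

if __name__ == "__main__":
    print(minimum_link_rate(1e6, 3.0))   # illustrative: 1 MHz channel, beta = 3
```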

Every node sends data at a rate (bits/sec) to a randomly
chosen destination. A node is both a source and a destination
node for another node. Therefore the total number
of source-destination pairs is $n$. The per-node throughput,
denoted by $\lambda(n)$, is defined as the maximum rate that could
be achieved by any source-destination pair simultaneously. A
throughput of $\lambda(n)$ is feasible if there is a temporal and
spatial scheduling scheme such that every node can send
$\lambda(n)$ bits/sec on average to its destination, i.e. there exists
a sufficiently large positive number $\tau$ such that in every finite
time interval $[(j - 1)\tau, j\tau]$ every node can send $\tau\lambda(n)$ bits
to its destination. A throughput is of order $\Theta(f(n))$ bits/sec
if there are deterministic constants $0 < c < c' < +\infty$
such that $\lim_{n \to \infty} \Pr(\lambda(n) = cf(n) \text{ is feasible}) = 1$ and
$\lim_{n \to \infty} \Pr(\lambda(n) = c'f(n) \text{ is feasible}) < 1$.

C.  CSMA  random  access  scheme 

In CSMA networks, two nodes, say $i$ and $j$, are allowed
to transmit simultaneously if they cannot detect each other's
transmission, i.e. both $P\ell(x_i, x_j)$ and $P\ell(x_j, x_i)$ are under
a certain detection threshold $P_{th}$ (this is also termed the
pairwise carrier-sensing decision model in [1]). This mechanism
imposes a minimum separation constraint among the
concurrent transmitters, known as the carrier-sensing range.
It readily follows from (2) that the carrier-sensing range $R_c$ is
given by

$R_c = (P/P_{th})^{1/\alpha}$ (4)

Under the carrier-sensing constraint, multiple nodes contend
for an opportunity to transmit and, at a particular time instant,
there can only be one node transmitting in a geographic region
determined by $R_c$. Therefore, the channel rate given by (3)
is shared by several nodes in the vicinity over time. Next,
we describe how to obtain the time-average channel rate (or
equivalently the long-term channel rate in [1]) for each node.
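The carrier-sensing range in (4) and the pairwise carrier-sensing decision can be illustrated as follows. This is a minimal sketch; the function names and numeric values are hypothetical.

```python
import math
from typing import Sequence

def carrier_sensing_range(power: float, p_th: float, alpha: float) -> float:
    """Carrier-sensing range R_c = (P / P_th)^(1 / alpha), as in (4)."""
    return (power / p_th) ** (1.0 / alpha)

def may_transmit_together(x_i: Sequence[float], x_j: Sequence[float],
                          power: float, p_th: float, alpha: float) -> bool:
    """True if the received power between the two nodes is under the threshold P_th."""
    received = power * math.dist(x_i, x_j) ** (-alpha)   # symmetric in i and j
    return received < p_th

if __name__ == "__main__":
    print(carrier_sensing_range(16.0, 1.0, 4.0))
    print(may_transmit_together((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), 16.0, 1.0, 4.0))
```

Since the attenuation in (2) is symmetric, a single comparison covers both directions; two nodes may transmit together exactly when their distance exceeds $R_c$.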

Same  as  that  in  [1],  we  consider  an  idealized  CSMA 
scheme.  Assume  that  each  node  maintains  a  countdown  timer , 
which  is  initialized  to  a  non-negative  random  integer.  The 
timer  of  a  node  counts  down  when  the  node  senses  the  channel 
idle,  otherwise  it  is  frozen.  A  node  initiates  its  transmission 
when  its  timer  reaches  zero  and  the  channel  is  idle.  After 
finishing  transmission,  the  node  resets  its  timer  to  a  new 
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random  integer.  The  average  countdown  time  can  be  distinct 
for  different  nodes,  which  can  be  set  to  control  the  state 
transition  probabilities  in  the  next  paragraph. 

The above CSMA scheme can be modeled by a Markov
chain with state space $\mathcal{S}$, where a state $S \in \mathcal{S}$ represents the
active transmitter set at a particular time instant. A transition
between two distinct states $S, S' \in \mathcal{S}$ can occur iff the two
states differ by exactly one node, i.e., $S' = \{i\} \cup S$ for some
node $i \in \Gamma$ with $i \notin S$, or vice versa. Transition $S \to \{i\} \cup S$
(where $i \notin S$) represents the event that node $i$ starts its transmission
after its timer counts down to zero. Transition $\{i\} \cup S \to S$
represents the event that node $i$ finishes its transmission and
hence becomes silent again. Let $\nu$ be the set of state transition
probabilities and denote the above Markov chain by $(\mathcal{S}, \nu)$.
Then the time-average channel rate available for each node
can be characterized by the stationary distribution of $(\mathcal{S}, \nu)$
[14], [1, Lemma 8].
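
To make the state space concrete, the sketch below enumerates the feasible active transmitter sets for a handful of nodes: a set is feasible when every pair of its members is separated by more than the carrier-sensing range. This is only an illustration under that pairwise-distance assumption; the brute-force enumeration is exponential in the number of nodes, and the function names are hypothetical.

```python
import math
from itertools import combinations


def feasible_states(positions, r_c):
    """All transmitter sets whose members are pairwise more than r_c apart."""
    n = len(positions)
    states = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if all(math.dist(positions[i], positions[j]) > r_c
                   for i, j in combinations(subset, 2)):
                states.append(frozenset(subset))
    return states


def one_step_neighbours(state, states):
    """States that differ from `state` by exactly one node (start or stop)."""
    return [s for s in states if len(s ^ state) == 1]
```

With three collinear nodes at x = 0, 1 and 5 and $R_c = 2$, nodes 0 and 1 block each other, so the feasible states are the empty set, the three singletons, {0, 2} and {1, 2}.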

IV.  Connectivity  of  3D  CSMA  Networks 

For any throughput to be feasible, a prerequisite is that there
exists a path between each pair of source and destination, i.e.,
the network is connected. In this section, we first derive an
upper bound on the interference experienced by any receiver
in the network. We further show that for an arbitrarily chosen
$\beta$, there exists a transmission range $R_0$ such that a pair of
nodes are directly connected if their Euclidean distance is
smaller than or equal to $R_0$. Based on that, we give a sufficient
condition on the transmit power for the 3D CSMA network
to be a.a.s. connected as $n \to \infty$ under the SINR model.

The following lemma gives an upper bound on the interference.


Lemma 1. Consider a CSMA network with nodes arbitrarily
distributed in a region in $\mathbb{R}^3$, where the carrier-sensing range is
$R_c$, given by (4). Denote by $r_0$ the Euclidean distance between
an arbitrary receiver and its intended transmitter, with $r_0 < R_c$.
When the path loss exponent $\alpha > 3$, the maximum interference
is upper bounded by $N(r_0)$, where

$$N(r_0) = \frac{12P}{(R_c - r_0)^{\alpha}} + \frac{17P}{(\kappa R_c - r_0)^{\alpha}} + \frac{9P\left(\left(\frac{251}{6}\alpha^2 - \frac{769}{6}\alpha + 89\right)R_c^2 - 27\sqrt{6}(\alpha - 1)R_c r_0 + 54 r_0^2\right)}{2\sqrt{6}(\alpha - 1)(\alpha - 2)(\alpha - 3)R_c^3\left(\frac{\sqrt{6}}{2}R_c - r_0\right)^{\alpha - 1}} \quad (5)$$

Here $\kappa$ stands for the numerical coefficient of $R_c$ in the second
term, whose value is not legible in this copy.

Proof: See Appendix. ■

Remark 2. Note that the upper bound in Lemma 1 applies
to an arbitrary node distribution in $\mathbb{R}^3$. Further, the requirement
that $\alpha > 3$ is for the interference to be bounded by a constant
independent of $n$. If $\alpha \le 3$, then the interference given by (13)
approaches infinity as $n \to \infty$. In that case, an upper bound on
interference can still be found by using the technique presented
in the proof of Lemma 1, but that bound will be a function of
$n$. In this paper, we focus on the situation that $\alpha > 3$ to avoid
some verbose but straightforward discussion on special cases
that occur when $\alpha \le 3$.

Corollary 3 is a consequence of Lemma 1.


Corollary 3. Under the same setting as that in Lemma 1,
there exists a transmission range $R_0 < R_c$ such that a pair
of nodes are directly connected if their Euclidean distance is
smaller than or equal to $R_0$, which is given implicitly by

$$\frac{P R_0^{-\alpha}}{N(R_0)} = \beta \quad (6)$$

Proof: Noting that $P r_0^{-\alpha}/N(r_0) \to \infty$ as $r_0 \to 0$, that
$P r_0^{-\alpha}/N(r_0) \to 0$ as $r_0 \to R_c^{-}$, and that $P r_0^{-\alpha}/N(r_0)$ is a
monotonically decreasing function of $r_0$, there is a unique
solution to (6). The rest of the proof is trivial and hence omitted. ■

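Since $P = P_{th}R_c^{\alpha}$, $R_0$ in (6) can also be expressed as a
function of $R_c$. Letting $R_c/R_0 = x$, (6) can be rewritten as

$$\frac{1}{\beta} = \frac{12}{(x - 1)^{\alpha}} + \frac{17}{(\kappa x - 1)^{\alpha}} + \frac{9\left(\left(\frac{251}{6}\alpha^2 - \frac{769}{6}\alpha + 89\right)x^2 - 27\sqrt{6}(\alpha - 1)x + 54\right)}{2\sqrt{6}(\alpha - 1)(\alpha - 2)(\alpha - 3)x^3\left(\frac{\sqrt{6}}{2}x - 1\right)^{\alpha - 1}} \quad (7)$$

where $\kappa$ is the coefficient of $R_c$ in the second term of the bound
in Lemma 1 (its numerical value is not legible in this copy).
It follows that $R_0 = R_c/b$, where $b$ is the solution to (7), which
depends on $\beta$ and $\alpha$ only. Equation (7) gives a more convenient
way to study the relation between $P$ and $R_0$.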

Based  on  Corollary  3  and  the  result  in  [15,  Theorem  1]  on 
the  connectivity  of  3D  networks  under  the  unit  disk  connection 
model,  we  obtain  the  following  theorem. 

Theorem 4. Consider a CSMA network with a total of $n$ nodes
independently and uniformly distributed in a cube with edge
length $n^{1/3}$. Under the SINR model, the network is a.a.s.
connected as $n \to \infty$ if the transmit power is

$$P = P_{th}(b')^{\alpha}(\log n + c(n))^{\alpha/3} \quad (8)$$

where $\lim_{n\to\infty} c(n) = +\infty$, $b' = b\,(3/4\pi)^{1/3}$, and $\infty > b > 1$ is
the solution to (7).

Proof:  The  theorem  readily  follows  from  the  result  in  [15, 
Theorem  1]  with  proper  scaling  and  Corollary  3.  ■ 
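
The power setting in (8) can be evaluated numerically once $b$ is known. The following sketch treats $b$ as a given input (it is the solution to (7)) and takes the value of $c(n)$ as an argument; any concrete choice such as $c(n) = \log\log n$ is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def connectivity_power(n: int, p_th: float, alpha: float,
                       b: float, c_n: float) -> float:
    """Transmit power from Eq. (8): P = P_th * b'^alpha * (log n + c(n))^(alpha/3).

    b   : solution to Eq. (7), supplied by the caller
    c_n : value of c(n) at this n (any sequence tending to infinity)
    """
    b_prime = b * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return p_th * b_prime ** alpha * (math.log(n) + c_n) ** (alpha / 3.0)
```

A quick consistency check: with $R_c = (P/P_{th})^{1/\alpha}$ from (4) and $R_0 = R_c/b$, the ball of radius $R_0$ has volume $\log n + c(n)$, which is the scaling of the unit disk connectivity result that the theorem builds on.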

An implication of Theorem 4 is that when $P$ is set as
in (8), a.a.s. there exists a temporal and spatial scheduling
scheme that allows any pair of nodes in the CSMA network to
exchange packets.

V.  Feasible  Throughput 

In this section, we first describe a routing algorithm
and then show that a throughput of
$\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$ can be
achieved using the routing algorithm in 3D CSMA networks.

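Partition the cube into non-overlapping cubelets of edge
length $s_n = (4\log n)^{1/3}$. Let $X_i$ be the random number of
nodes in cubelet $i$. Let $\overline{X} = \max_{i \in T} X_i$ and
$\underline{X} = \min_{i \in T} X_i$, where
$T$ represents the set of indices of all cubelets. We obtain:

Lemma 5. $\lim_{n\to\infty}\Pr(\overline{X} < c_1\log n) = 1$ and
$\lim_{n\to\infty}\Pr(\underline{X} > c_2\log n) = 1$, where $c_1 = 4\left(1 + \frac{\sqrt{3}}{2}\right)$ and
$c_2 = 4(1 - \delta')$, with $\delta' \in (0,1)$ the constant chosen in the proof.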

Proof: Note that $X_i$ has a binomial distribution with
parameters $n$ and $s_n^3/n$, and $E[X_i] = s_n^3 = 4\log n$. Using
the Chernoff bound, we have that for any $\delta \in (0,1)$,
$\Pr[X_i > (1+\delta)E[X_i]] \le \exp\left(-\frac{\delta^2 E[X_i]}{3}\right)$ holds.
Let $\delta = \frac{\sqrt{3}}{2}$; then
$\Pr\left[X_i > 4\left(1 + \frac{\sqrt{3}}{2}\right)\log n\right] \le \exp(-\log n) = \frac{1}{n}$.

There are a total of $n/s_n^3$ cubelets (here we ignored some
trivial discussion on the granularity problem caused by $n/s_n^3$
not being an integer). By the union bound, we have

$$\lim_{n\to\infty}\Pr\left[\bigcup_{i=1}^{n/s_n^3}\left(X_i > 4\left(1 + \tfrac{\sqrt{3}}{2}\right)\log n\right)\right] = 0.$$

Using a similar method, for any $\delta' \in (0,1)$ the lower tail satisfies
$\Pr[X_i < (1-\delta')E[X_i]] \le \exp\left(-\frac{\delta'^2 E[X_i]}{2}\right)$. Taking a
suitable $\delta'$ (its value is not legible in this copy) and using the
union bound yields the result for $\underline{X}$. ■

Figure 1. An illustration on the number of routes served by one cubelet.
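
The union-bound step in the proof of Lemma 5 is easy to check numerically. The sketch below evaluates the number of cubelets times the per-cubelet Chernoff bound; with $\delta = \sqrt{3}/2$ this product equals $1/(4\log n)$, which tends to zero. The function name is hypothetical and the granularity of the partition is ignored, as in the text.

```python
import math


def upper_tail_union_bound(n: int, delta: float) -> float:
    """Union bound on Pr(some cubelet holds more than (1+delta)*4*log n nodes)."""
    mean = 4.0 * math.log(n)                 # E[X_i] = s_n^3 = 4 log n
    per_cubelet = math.exp(-delta ** 2 * mean / 3.0)   # Chernoff upper tail
    num_cubelets = n / mean                  # n / s_n^3, granularity ignored
    return num_cubelets * per_cubelet
```

Evaluating the function for increasing $n$ shows the slow, logarithmic decay of the bound, which is why the result is stated asymptotically.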


A. The maximum traffic served by each cubelet

For a given $\beta$, using (7) and (4), we can choose a transmit
power so that the transmission range $R_0$, given by Corollary 3,
is $\sqrt{6}s_n$. This value allows any two nodes in two neighboring
cubelets to directly communicate with each other under the
SINR model. Hence, the channel rate between two nodes
whose Euclidean distance is less than or equal to $R_0$ is at
least $B\log_2(1+\beta)$. Using (7), we can write $R_c = b\sqrt{6}s_n$,
where $b$ is the solution to (7) for the given $\beta$.

We employ a routing scheme similar to that used in [4]. For
a pair of source and destination nodes located at $(x_s, y_s, z_s)$
and $(x_d, y_d, z_d)$ respectively, packets generated by the source
are first relayed to a node closest to $(x_s, y_s, z_d)$, then to a node
closest to $(x_d, y_s, z_d)$, and finally delivered to $(x_d, y_d, z_d)$.
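
The routing rule above moves a packet along one coordinate axis at a time. A minimal sketch of the intermediate targets, with a hypothetical function name, is:

```python
def route_waypoints(src, dst):
    """Targets visited by the axis-by-axis route: first z, then x, then y.

    In the network, each waypoint is replaced by the node closest to it;
    this helper only returns the ideal coordinates.
    """
    xs, ys, zs = src
    xd, yd, zd = dst
    return [(xs, ys, zs), (xs, ys, zd), (xd, ys, zd), (xd, yd, zd)]
```

Because consecutive waypoints differ in a single coordinate, every route consists of at most three straight segments, which is what limits the set of sources whose traffic can cross a given cubelet.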

As illustrated in Fig. 1, the shaded space represents a
cubelet. According to the above routing scheme, only nodes
located in the three rectangular cuboids, which are bounded by
dashed lines in the figure, possibly need nodes in the shaded
cubelet to relay their data. Each cuboid contains $n^{1/3}/s_n$
cubelets, and by Lemma 5 each cubelet a.a.s. holds at most
$c_1\log n$ nodes. Therefore the maximum number of
routes served by each cubelet is

$$N_{routes} = \frac{3n^{1/3}c_1\log n}{s_n} = \frac{3c_1 n^{1/3}(\log n)^{2/3}}{4^{1/3}} \quad (9)$$

Suppose each node sends data at a rate $\lambda(n)$ bits/sec to
its destination. The maximum amount of traffic each cubelet
needs to transmit is $N_{routes}\lambda(n)$ at most.


B.  Time-average  channel  rate  for  each  node 

In this subsection, we first construct a deterministic TDMA
scheduling $(S_t)_{t=1}^{m}$, where $S_t \in \mathcal{S}$ is the active transmitter set
during time slot $t$, and determine the time-average channel rate
$C_i^{det}[(S_t)_{t=1}^{m}]$ for node $i \in \Gamma$ under this scheme. Then, using
the result in [1, Lemma 9], we show that
$C_i^{det}[(S_t)_{t=1}^{m}] \le C_i^{ran}[(\mathcal{S}, \nu)]$, where $C_i^{ran}[(\mathcal{S}, \nu)]$ is the
time-average channel rate for node $i$ under the CSMA scheme described
in Section III-C (modeled by the Markov chain $(\mathcal{S}, \nu)$). Finally, we
establish a lower bound on $C_i^{ran}[(\mathcal{S}, \nu)]$.

As shown in (3), any pair of directly connected nodes can
transmit at a rate of at least $B\log_2(1+\beta)$. For convenience, in
the following discussion, we consider that the rate equals
$B\log_2(1+\beta)$ and normalize it to 1. We divide time into slots
of unit length. It follows that the channel rate available for a
particular node $i$ under the scheduling scheme $(S_t)_{t=1}^{m}$ is equal
to the fraction of time that node $i$ gets to transmit, i.e.,

$$C_i^{det}[(S_t)_{t=1}^{m}] = \frac{1}{m}\sum_{t=1}^{m}\mathbf{1}(i \in S_t)$$


We group adjacent cubelets into non-overlapping cubes, and
each cube contains $(k+1)^3$ cubelets, where $k = \lceil b\sqrt{6}\rceil$ so
that $ks_n \ge R_c = b\sqrt{6}s_n$. Using Lemma 5, a.a.s. there are at
most $c_1\log n$ nodes in every cubelet. Based on the above
discussion, a deterministic scheduling algorithm can be designed
such that within time slots from $t = 1$ to $t = (k+1)^3c_1\log n$,
each node gets at least one time slot to transmit while the
set of concurrent transmitters meets the CSMA constraints.
Denote by $S_t$ the concurrent transmitter set during time slot
$t$. It follows that $S_t \in \mathcal{S}$ for $t \in [1, (k+1)^3c_1\log n]$ and the
fraction of time spent on each $S_t$, $t \in [1, (k+1)^3c_1\log n]$, is
$\frac{1}{(k+1)^3c_1\log n}$. Letting $m = (k+1)^3c_1\log n$, it then follows
that there is a deterministic scheduling that can achieve a
time-average channel rate of at least
$\frac{B\log_2(1+\beta)}{(k+1)^3c_1\log n}$ for node $i$, i.e.,

$$C_i^{det}[(S_t)_{t=1}^{m}] \ge \frac{B\log_2(1+\beta)}{(k+1)^3c_1\log n} \quad (10)$$
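
The text states that a suitable deterministic schedule can be designed without spelling it out. One possible construction, offered here only as a hedged sketch, assigns each cubelet to one of $(k+1)^3$ groups by taking its integer indices modulo $k+1$; cubelets in the same group are then separated by at least $k$ cubelet edges, i.e., by at least $ks_n \ge R_c$. Giving each group $c_1\log n$ consecutive slots yields a frame of $(k+1)^3c_1\log n$ slots. The function names and the indexing convention are assumptions.

```python
from itertools import product


def slot_group(cubelet, k):
    """Group index in 0 .. (k+1)^3 - 1 for a cubelet with integer indices."""
    ix, iy, iz = cubelet
    m = k + 1
    return (ix % m) * m * m + (iy % m) * m + (iz % m)


def gap_in_cubelet_edges(a, b):
    """Lower bound, in units of s_n, on the distance between points of two
    distinct cubelets: the largest per-axis index difference minus one."""
    return max(abs(p - q) for p, q in zip(a, b)) - 1
```

Within a group's block of slots, each cubelet can serve its nodes one per slot, so every node transmits at least once per frame while simultaneous transmitters stay at least $ks_n$ apart.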


We use the result in [1, Lemma 9], which states that there
exists a properly designed CSMA scheme that delivers suitable
state transition probabilities $\nu$ such that for each node $i$, the
following holds:

$$C_i^{ran}[(\mathcal{S}, \nu)] \ge C_i^{det}[(S_t)_{t=1}^{m}] \quad (11)$$

where $C_i^{det}[(S_t)_{t=1}^{m}]$ in (11) is the time-average channel rate
available for node $i$ under a deterministic scheduling scheme.
Combining (11), (10), and (3), it can be established that for
node $i \in \Gamma$, $C_i^{ran}[(\mathcal{S}, \nu)] \ge \frac{B\log_2(1+\beta)}{(k+1)^3c_1\log n}$.

Lemma 5 also implies that the minimum number of nodes
in every cubelet is greater than or equal to $c_2\log n$ a.a.s.
Therefore, the minimum time-average channel rate available
for each cubelet under the CSMA scheme is

$$\frac{(c_2\log n)B\log_2(1+\beta)}{(k+1)^3c_1\log n} = \frac{c_2B\log_2(1+\beta)}{c_1(k+1)^3} \quad (12)$$


C.  Lower  bound  on  throughput 

For any per-node throughput $\lambda(n)$ to be feasible, the traffic
load for each cubelet should not exceed the time-average
channel rate available for each cubelet, i.e.,

$$N_{routes}\lambda(n) \le \frac{c_2B\log_2(1+\beta)}{c_1(k+1)^3} \text{ bits/sec}$$

which results in a lower bound on the feasible per-node
throughput. This is summarized in Theorem 6, which forms
another major contribution of this paper.
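
The feasibility condition above can be turned into a one-line calculation. The sketch below divides the per-cubelet rate in (12) by the number of routes a cubelet serves; the function and parameter names are hypothetical, and the route count is passed in by the caller rather than computed.

```python
import math


def per_node_throughput_bound(n_routes: float, c1: float, c2: float,
                              k: int, bandwidth: float, beta: float) -> float:
    """Largest lambda(n) with n_routes * lambda(n) <= c2*B*log2(1+beta) / (c1*(k+1)^3)."""
    cubelet_rate = c2 * bandwidth * math.log2(1.0 + beta) / (c1 * (k + 1) ** 3)
    return cubelet_rate / n_routes
```

Since the per-cubelet rate does not depend on $n$, the scaling of the throughput with $n$ comes entirely from how the number of routes per cubelet grows.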
Figure 2. Densest sphere packing.


Theorem 6. For the considered CSMA networks, there
exists a deterministic constant $c > 0$, independent of
$\alpha$ and $\beta$, such that a per-node throughput

$$\lambda(n) = \frac{c\log_2(1+\beta)}{(k+1)^3}\cdot\frac{1}{(n\log^2 n)^{1/3}} \text{ bits/sec}$$

is feasible a.a.s. as $n \to \infty$,
where $k = \lceil b\sqrt{6}\rceil$ and $b$ is the solution to (7) for a given $\beta$.

VI. Conclusion

In this paper, we studied the throughput of 3D CSMA
networks. We first provided a sufficient condition on the
transmit power for having an a.a.s. connected 3D CSMA
network under the SINR model. Then, using a simple routing
scheme, we showed that a per-node throughput of
$\Theta\left(\frac{1}{(n\log^2 n)^{1/3}}\right)$
is feasible even when a distributed random access scheme is
used and a minimum SINR for each successful transmission
is specified. It remains our future work to study the optimum
access scheme that maximizes the per-node throughput.


Appendix:  Proof  of  Lemma  1 


The derivation of the upper bound on interference is similar
to that used in [16]. The difference is that here we use the
densest sphere packing in 3D space to derive the upper bound.

Construct a coordinate system such that the origin $o$ is at a
transmitter $w$. Consider a transmission from $w$ to its receiver
$z$ located at $\mathbf{z}$ and define $r_0 = \|\mathbf{z}\|$.

Draw a sphere of radius $R_c/2$ centered at each concurrent
transmitter. The two spheres centered at the two closest
transmitters cannot overlap. The maximum interference happens
when these spheres are placed in the densest way, which is
to place the sphere centers at the vertices of a face-centered
cubic lattice [17, p. 9]. See Fig. 2 for an illustration.

Group the sphere centers into tiers of increasing distance
from the origin. Picture (a) in Fig. 2 shows the spheres in
the 1st tier (in dark shade) and the 2nd tier (in light shade)
whose centers are located on the $x$-$y$ plane. All of the
spheres in the 2nd tier whose centers are above the $x$-$y$
plane (including those on the $x$-$y$ plane) are shown in light
shade in picture (c). Although Fig. 2 only shows the packing
along the $+z$ axis, the packing goes along the $-z$ axis in the
same way. The number of interferers in the
$j$th tier is $27j^2 + 2$. Let $x_i$ be the location of an interferer.
The minimum distance from an interferer in the $j$th tier to the
origin is $\frac{\sqrt{6}}{3}jR_c$. Denote by $I(r_0)$ the interference at node $z$.
Since $\|x_i - \mathbf{z}\| \ge \|x_i\| - r_0$, from (2) we obtain

$$I(r_0) \le \frac{12P}{(R_c - r_0)^{\alpha}} + \frac{17P}{(\kappa R_c - r_0)^{\alpha}} + \sum_{j=2}^{\infty}\frac{P(27j^2 + 2)}{\left(\frac{\sqrt{6}}{3}jR_c - r_0\right)^{\alpha}} \quad (13)$$

where $\kappa$ denotes the coefficient of $R_c$ in the second term, whose
numerical value is not legible in this copy.
The first two terms on the RHS of (13) account for the interference
caused by the 1st tier interferers. Let $U_j$, $j = 2, \ldots, \infty$, be
independent random variables, with $U_j$ uniformly distributed in
$[j - \frac{1}{2}, j + \frac{1}{2}]$. It
follows from the convexity of $(27j^2 + 2)\left(\frac{\sqrt{6}}{3}jR_c - r_0\right)^{-\alpha}$
when $j \ge 2$, $\alpha > 3$ and Jensen's inequality that

$$\begin{aligned}
P\sum_{j=2}^{\infty}(27j^2 + 2)\left(\tfrac{\sqrt{6}}{3}jR_c - r_0\right)^{-\alpha}
&= P\sum_{j=2}^{\infty}\left(27(E(U_j))^2 + 2\right)\left(\tfrac{\sqrt{6}}{3}E(U_j)R_c - r_0\right)^{-\alpha} \\
&\le P\sum_{j=2}^{\infty}E\left[(27U_j^2 + 2)\left(\tfrac{\sqrt{6}}{3}U_jR_c - r_0\right)^{-\alpha}\right] \\
&= P\sum_{j=2}^{\infty}\int_{j-1/2}^{j+1/2}(27x^2 + 2)\left(\tfrac{\sqrt{6}}{3}xR_c - r_0\right)^{-\alpha}dx \\
&= \frac{9P\left(\left(\frac{251}{6}\alpha^2 - \frac{769}{6}\alpha + 89\right)R_c^2 - 27\sqrt{6}(\alpha - 1)R_cr_0 + 54r_0^2\right)}{2\sqrt{6}(\alpha - 1)(\alpha - 2)(\alpha - 3)R_c^3\left(\frac{\sqrt{6}}{2}R_c - r_0\right)^{\alpha - 1}} \quad (14)
\end{aligned}$$

Substituting (14) into (13), Lemma 1 is proved.
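
The Jensen step can be sanity-checked numerically by comparing a partial sum over tiers $j \ge 2$ with the closed-form integral from $x = 3/2$ to infinity. In the sketch below the fractions 251/6 and 769/6 are the values obtained by evaluating that integral directly; they are used here as a reconstruction, and the function names are hypothetical.

```python
import math

SQRT6 = math.sqrt(6.0)


def tier_sum(p, r_c, r0, alpha, tiers=20000):
    """Partial sum over tiers j = 2..tiers of P(27 j^2 + 2) / (sqrt(6)/3 * j * R_c - r0)^alpha."""
    return sum(p * (27 * j * j + 2) / ((SQRT6 / 3.0) * j * r_c - r0) ** alpha
               for j in range(2, tiers + 1))


def closed_form_bound(p, r_c, r0, alpha):
    """Integral of P(27 x^2 + 2)(sqrt(6)/3 * x * R_c - r0)^(-alpha) over x >= 3/2 (needs alpha > 3)."""
    num = 9.0 * p * ((251.0 / 6.0 * alpha ** 2 - 769.0 / 6.0 * alpha + 89.0) * r_c ** 2
                     - 27.0 * SQRT6 * (alpha - 1.0) * r_c * r0
                     + 54.0 * r0 ** 2)
    den = (2.0 * SQRT6 * (alpha - 1.0) * (alpha - 2.0) * (alpha - 3.0) * r_c ** 3
           * ((SQRT6 / 2.0) * r_c - r0) ** (alpha - 1.0))
    return num / den
```

For instance, with $P = 1$, $R_c = 1$, $r_0 = 0.5$, and $\alpha = 4$, the partial sum stays below the closed-form value, as the inequality requires.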
Abstract — In this paper, we study the information propagation
process in a 1-D mobile ad hoc network formed by vehicles Poissonly
distributed on a highway and traveling in the same direction
at randomly distributed speeds that are independent between
vehicles. Considering a model in which time is divided into time slots
of  equal  length  and  each  vehicle  changes  its  speed  at  the  beginning 
of  each  time  slot,  independent  of  its  speed  in  other  time  slots,  we 
derive  analytical  formulas  for  the  fundamental  properties  of  the 
information  propagation  process  and  the  information  propagation 
speed  (IPS).  Using  the  formulas,  one  can  straightforwardly  study 
the  impact  on  the  IPS  of  various  parameters  such  as  radio  range, 
vehicular  traffic  density,  and  time  variation  of  vehicle  speed. 
The  accuracy  of  the  results  is  validated  using  simulations.  The 
research  provides  useful  guidelines  on  the  design  of  vehicular  ad 
hoc  networks  (VANETs). 

Index  Terms — Information  propagation  speed  (IPS),  mobile 
ad  hoc  network,  vehicular  ad  hoc  network  (VANET). 

I.  Introduction 

A vehicular ad hoc network (VANET) is a mobile
multihop network formed by vehicles traveling on the
road. As a new way of communication, VANETs have attracted
significant interest in not only academia but in industry as well
[1]. IEEE has taken up working on new standards for VANETs,
such as the IEEE 1609 Family of Standards for Wireless Access
in Vehicular Environments (WAVE) [2]. Furthermore, there
are many projects on VANETs such as InternetITS in Japan,
Network on Wheels in Germany, and the PReVENT project in
Europe [3]. In this paper, we study the expected propagation
speed for a piece of information to be broadcast along the
road in a VANET, which is referred to as the information
propagation speed (IPS). Due to the mobility of vehicles, the
topology of a VANET is changing over time. Furthermore,
traffic density on a road can significantly vary, depending on
the time of day or day of the week. Therefore, the information
propagation process in a VANET can be quite different from
that in a static network.

Fig. 1. Topology of a VANET at different time instances. In the figures in
this paper, the positive direction of the axis is the direction of information
propagation.

A VANET is usually partitioned into a number of clusters [4],
[5], where a cluster is a maximal set of vehicles in which every
pair of vehicles is connected by at least one multihop path.
Due to the mobility of vehicles, the clusters are splitting and
merging over time. Therefore, the information propagation in a
VANET is typically based on a store-and-forward scheme, as
in a delay-tolerant network [4]. Considering the example
shown in Fig. 1, a piece of information starts to propagate
from the origin toward the positive direction of the axis at time
$t_0$. The vehicles that have received this piece of information
are referred to as the informed vehicles, whereas other vehicles
are uninformed. As indicated by the leftmost ellipse, the first
informed vehicle is inside a cluster of four vehicles at time $t_0$.
At time $t_1$, the message is forwarded, in a multihop manner,
to the foremost vehicle in its cluster. The propagation of the
message within a cluster, which begins at $t_0$ and ends at $t_1$,
is called a forwarding process. In a forwarding process, the
IPS is determined by the per-hop delay and the length of the
cluster. The per-hop delay $\beta$ is the time required for a vehicle to
receive and process a message before it is available for further
retransmission [6]. The value of $\beta$ depends on the practical
implementation, and a common assumption for the value of
$\beta$ reflecting typical technology is 4 ms [6]. We show that the
per-hop delay has a significant impact on the IPS, particularly
when the vehicle density is high.

Define the head at time $t$ to be the informed vehicle with
the largest coordinate at time $t$. Define the tail at time $t$ to be
the uninformed vehicle with the smallest coordinate at time $t$.
Two vehicles can directly communicate with each other if and
only if their Euclidean distance is smaller than or equal to the
radio range $r_0$. Although this so-called unit disk model is a
simplified model, it can be indicative for real-world scenarios.
A realistic radio model usually takes into account statistical
variations of the received signal power around its mean value
[7]. It is shown in [7] that these variations can actually increase
the connectivity of a network. Therefore, the analysis under
the unit disk model provides a conservative estimate on the
performance of a VANET. (We further explore this issue in
Section VIII-C.) As shown in Fig. 1, at time $t_1$, the tail is outside
the radio range of the head. Then, a catch-up process begins,
during which the informed vehicles hold the information until
the head catches up with the tail (at time $t_2$). We study both the
forwarding process and the catch-up process in this paper.
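
The definitions of cluster, head, and tail under the unit disk model translate directly into a few lines of code. The following Python sketch works on 1-D coordinates; the function names are hypothetical and it only illustrates the definitions, not the analysis that follows.

```python
def clusters(positions, r0):
    """Split 1-D vehicle positions into clusters: consecutive vehicles
    belong to the same cluster iff their gap is at most the radio range r0."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= r0:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups


def needs_catch_up(informed, uninformed, r0):
    """True if the tail (smallest uninformed coordinate) is outside the
    radio range of the head (largest informed coordinate)."""
    if not informed or not uninformed:
        return False
    head, tail = max(informed), min(uninformed)
    return tail - head > r0
```

When `needs_catch_up` returns True, forwarding stalls and the message advances only through vehicle movement until the head comes within range of the tail.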

The main contributions of this paper are as follows:

1) Analytical results on the distribution of the time required for
a catch-up process are provided. The impact of vehicle density,
vehicle speed distribution, and vehicle speed variation over time
is considered, whereas previous research (e.g., [5] and [8]) did
not consider the impact of the time-varying vehicle speed, which
results in an unrealistic model according to traffic theory [9].

2) A first passage phenomenon, which will be introduced later,
is considered in the analysis. This first passage phenomenon is
a major technical hurdle in the accurate analysis of the catch-up
process when vehicle speeds are allowed to change over time.
To the best of our knowledge, this work is the first attempt
to consider the impact of the first passage phenomenon on
the information propagation in VANETs.

3) The forwarding process is studied, taking into account the
per-hop propagation delay and packet collision, which gives an
upper bound on the IPS that was not recognized by previous
research (e.g., [5], [8], and [10]). It is shown that the per-hop
delay and packet collision have significant impacts on the IPS,
particularly when the vehicle density is high.

Finally, a closed-form equation is
derived for the distribution of the length of a cluster, whereas, in
previous research, only numerical solutions [3] or approximate
results [5] were provided. Based on the preceding results, the
analytical results for the IPS are derived, including the impact
of various parameters such as radio range, vehicular traffic
density, and the time variation of vehicle speed. The results in
this paper provide useful guidelines on the design of a mobile
VANET.
The  rest  of  this  paper  is  organized  as  follows:  Section  II 
reviews  the  related  work.  Section  III  introduces  the  mobility 
model  and  network  model  used  in  this  paper.  The  analysis  on 
the  catch-up  process  for  a  generic  speed  distribution  is  given  in 
Section  IV,  followed  by  the  analysis  on  the  catch-up  process  for 
the  Gaussian  speed  distribution  in  Section  V.  In  Section  VI,  the 
analysis  on  the  forwarding  process  is  provided,  including  the 
analysis  on  the  cluster  length.  Based  on  the  preceding  results, 
the  IPS  is  derived  in  Section  VII.  Finally,  Section  IX  concludes 
this  paper. 


II.  Related  Work 

In recent years, VANETs have attracted significant interest
due to their large number of potential applications [1]. In [11],
Fracchia and Meo introduced the design of a warning delivery
service in VANETs. They studied the propagation of a warning
message in a 1-D VANET, where vehicles move in the opposite
direction from the propagating direction of the warning
message. However, their analysis is based on an oversimplified
assumption that the topology does not change over time during
the information propagation process. In [12], Camara et al.
studied the propagation speed of public safety warning messages
in a VANET when infrastructures (road side units) are
destroyed by natural disasters such as flooding and earthquakes.
Through  simulation,  they  showed  that  using  vehicles  as  virtual 
road  side  units  can  significantly  speed  up  the  warning  message 
distribution  process,  compared  with  traditional  emergency  alert 
systems  that  rely  on  infrastructures. 

The  IPS  is  an  important  performance  metric  for  VANETs, 
particularly  for  the  safety  messaging  applications  [12],  [13].  In 
[6],  Wu  et  al.  studied  the  IPS  through  simulations.  They  used 
a  commercial  microscopic  traffic  simulator,  i.e.,  CORSIM  [14], 
to  simulate  the  traffic  on  a  highway.  Then,  the  topology  data 
from  CORSIM  were  imported  into  a  wireless  communication 
simulator  to  study  the  properties  of  the  information  propagation 
process.  They  showed  that  the  IPS  significantly  varies  with 
vehicle  densities. 

There are analytical studies on VANETs based on the assumption
that the vehicle speed does not change over time,
which is referred to as the constant speed model. In [3],
Yousefi et al. provided analytical results on the distribution of
intervehicle distance in a 1-D VANET under the constant-speed
model and the Poisson arrival model: In the Poisson arrival
model, the number of vehicles passing an observation point
on the road during any time interval follows a homogeneous
Poisson process with intensity $\lambda$. By applying the results used
in studying the busy period in queuing theory, they further
analyzed the connectivity distance, which is a quantity similar to
the cluster length to be introduced later in this paper. However,
they  did  not  provide  a  closed-form  formula  for  the  distribution 
of  the  cluster  length.  In  [8],  Agarwal  et  al.  studied  the  IPS  in 
a  1-D  VANET  where  vehicles  are  Poissonly  distributed  and 
move  at  the  same  speed  but  in  either  the  positive  or  the  negative 
direction  of  the  axis.  They  derived  the  upper  and  lower  bounds 
for  the  IPS,  which  provided  a  hint  on  the  impact  of  vehicle 
density  on  the  IPS.  However,  the  bounds  are  not  tight,  and  many 
factors,  e.g.,  time  variation  of  speeds  and  propagation  delay, 
were  ignored  in  their  analysis.  In  [5],  Wu  et  al.  considered  a 
1-D  VANET  where  vehicles  are  Poissonly  distributed  and  the 
vehicle  speeds  are  uniformly  distributed  in  a  designated  range. 
They  provided  a  numerical  method  to  compute  the  IPS  under 
two  special  network  models,  i.e.,  when  the  vehicle  density  is 
either  very  low  or  very  high,  which  are  obviously  oversimplified 
models  [4],  [9].  The  aforementioned  studies  [3],  [5],  [8]  were 
all  based  on  the  constant  speed  model.  In  this  paper,  the  time 
variation  of  vehicle  speed  is  considered  and  is  shown  to  have 
a  significant  impact  on  the  information  propagation  process  in 
VANETs. 
III. System Model

A.  Mobility  Model 

A synchronized random walk mobility model is considered
in this paper. Specifically, time is divided into time slots of
equal length $\tau$. Each vehicle randomly chooses its new speed at
the beginning of each time slot, independent of other vehicles
and its own speed in other time slots, according to a certain
distribution with a mean value $E[v]$. It is shown later that the
constant-speed model forms a special case of the aforementioned
mobility model when $\tau \to \infty$. Due to limited
acceleration, the vehicle speed in the real world does not change
as rapidly as in the aforementioned mobility model. Therefore,
as will be shown in Section VIII-C, the results based on the
aforementioned model provide a conservative estimate on the
IPS, which is desirable for the safety messaging applications
(e.g., [11] and [12]). Furthermore, we also discuss the impact
of a nonsynchronized mobility model at the beginning of
Section V.

In the preceding model, the speed of a vehicle can be
considered as having a constant component $E[v]$ and a variable
component with zero mean. Accordingly, the vehicular network
can be decomposed into two components: 1) a network in which
all vehicles travel at a constant speed and 2) a network in
which vehicles travel at speeds following the same prescribed
distribution $f_v(v)$ with zero mean. Our analysis focuses on the
IPS in the second network component, where $f_v(v)$ is the
probability density function (pdf) of the speed distribution. The first
network component is separately considered and is combined
into the result at the end of the analysis. Furthermore, a positive
(negative) value of the speed means that the vehicle is traveling
in the same (opposite) direction as the direction of information
propagation. Therefore, when $E[v]$ is a positive (negative)
value, our analysis provides the results for a VANET in which
a message propagates in the same (opposite) direction as the
direction of the vehicle traffic flow. In this paper, the analysis is
first performed for a generic speed distribution. Then, detailed
analytical results are given for the Gaussian speed distribution
with standard deviation $\sigma$, which is commonly used for the
VANETs on a highway [3], [9], [15].

The aforementioned speed-change time interval $\tau$ depends
on practical conditions, e.g., a sports car may change its speed
more frequently than a heavy truck. Reasonable values for the
time interval can be from 1 to 25 s [16]. The vehicle mobility
parameters, i.e., $E[v]$, $\sigma$, and $\tau$, are taken from practical
measurements. Typical values for $E[v]$ and $\sigma$ are given in
[15], where the usual record time intervals for a vehicle speed
monitor are $\tau = 1$ s, 5 s [17]. We conduct our analysis in the
discrete-time domain ($t = i\tau$) to obtain closed-form analytical
equations, which give better insight into the impact of different
parameters on the IPS. Extension to the continuous-time
domain is straightforward, following the procedure outlined in
this paper.
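
The synchronized random walk mobility model with Gaussian speeds can be simulated in a few lines. This sketch, with a hypothetical function name, redraws every vehicle's speed independently at the start of each slot of length $\tau$ and advances positions accordingly; the numerical values in any call are illustrative and not taken from the measurements cited above.

```python
import random


def simulate_positions(initial, mean_speed, sigma, tau, slots, rng):
    """Advance vehicles for `slots` time slots of length `tau`.

    At the start of every slot each vehicle draws a new speed from a
    Gaussian with mean `mean_speed` and standard deviation `sigma`,
    independently of other vehicles and of its own earlier speeds.
    """
    positions = list(initial)
    for _ in range(slots):
        positions = [x + rng.gauss(mean_speed, sigma) * tau for x in positions]
    return positions
```

Setting `sigma` to zero recovers the constant-speed model, in which all intervehicle distances are preserved.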

B.  Network  Model 

We adopt a commonly used traffic model in vehicular traffic
theory [9] in which vehicles independently travel in the same
direction on a 1-D infinite line and follow the Poisson arrival
model with a rate $\lambda$ veh/s. The Poisson arrival model is a
commonly used traffic model in vehicular traffic theory based on
real-world measurements [9]. Furthermore, the Poisson arrival
model and the Poisson distribution of the vehicles are also
commonly used traffic models in studies of VANETs [3], [8], [11].

The  following  lemma  relates  the  spatial  distribution  of  the 
vehicles  on  the  road  to  the  Poisson  arrival  model  of  the  vehicles. 
The  result  on  the  spatial  distribution  of  vehicles  is  used  in  the 
rest  of  this  paper. 

Lemma 1: If the traffic in a VANET follows the Poisson
arrival model with rate $\lambda$ and the speed of each vehicle changes
at the beginning of each time slot, independent of other vehicles,
according to $f_v(v)$, then at any time instant, the spatial
distribution of the vehicles on the road follows a homogeneous
Poisson process with intensity
$\rho = \lambda \int_{-\infty}^{\infty} (f_v(v)/v)\,dv$.

Proof: It has been shown in [3] that, if the vehicle speeds
do not change over time, then at any time instant, the distances
between adjacent vehicles (intervehicle distance $l$) are independent
and follow an exponential distribution with rate parameter
$\rho = \lambda \int_{-\infty}^{\infty} (f_v(v)/v)\,dv$. It follows that the spatial
distribution of the vehicles follows a homogeneous Poisson process with
intensity $\rho$.

Next, we apply mathematical induction to study the
spatial distribution of the vehicles under our mobility model,
in which vehicles are allowed to change their speeds from one
time slot to another. In the first time slot $[0, \tau)$, it is straightforward
to show that the spatial distribution of the vehicles follows
a homogeneous Poisson process with intensity $\rho$, because the
speed does not change during a time slot.

Assume that, in the $i$th time slot $[(i-1)\tau, i\tau)$, the spatial
distribution of the vehicles still follows a homogeneous Poisson
process with intensity $\rho$. When the next time slot begins, each
vehicle chooses its new speed, independent of other vehicles,
according to $f_v(v)$. Then, according to the random splitting
property of a Poisson process [18], the spatial distribution of the
vehicles traveling at the speed $v \in [v_m, v_m + dv_m)$ during the
$(i+1)$th time slot $[i\tau, (i+1)\tau)$ follows a homogeneous Poisson
subprocess with intensity $\rho f_v(v_m)\,dv_m$. These vehicles have
the same speed so that, at any time instant during the $(i+1)$th
time slot, the spatial distribution of these vehicles does not
change. Then, according to the random coupling property of a
Poisson process [18], the spatial distribution of all the vehicles
in the $(i+1)$th time slot follows a homogeneous Poisson process,
which is the sum of all the subprocesses, with the intensity

$$\int_{-\infty}^{\infty} \rho f_v(v_m)\,dv_m = \rho. \qquad \blacksquare$$

Note that previous research [3] considered only the case in which
the vehicle speeds do not change over time. In Lemma 1, we
account for the time variation of vehicle speeds in the analysis of
the vehicle distribution on the road.

IV.  Catch-Up  Process  for  a  Generic 
Speed  Distribution 

In this section, we study the catch-up process in a VANET
where the vehicle speeds follow a generic pdf $f_v(v)$. Without
loss of generality, it is assumed that the catch-up process starts
at time 0. The displacement $x$ of a vehicle at time $t$ is defined
as the difference between the position of the vehicle at time 0
and its position at time $t$.

Fig. 2. VANET at the beginning of a catch-up process with gap $l_c$, where $l_c$
is the Euclidean distance between the head and the tail at time 0. Hereinafter, a
catch-up process where the distance between the head and the tail is $l_c$ at time
0 is referred to as a catch-up process with gap $l_c$.

A.  Modeling  the  Movement  of  a  Single  Vehicle 

Denote by $p(x, t)$ the probability that the displacement of a
vehicle is $x$ at time $t$. Because the speed does not change during
a time slot, $p(x, \tau)$ can be easily obtained from $f_v(v)$.

Due to the independence of the vehicle speeds in different
time slots (hence, the displacements), we have, for $t = i\tau$,

$$p(x, t) = p(x, i\tau) = \underbrace{(p * p * \cdots * p)}_{i\text{-fold convolution}}(x, \tau). \tag{1}$$

The calculation of the aforementioned $i$-fold convolution
can be simplified by using the Fourier and inverse Fourier
transforms.
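
As a rough illustration of (1), the sketch below builds the $i$-fold convolution by repeated direct convolution on a grid. The uniform speed range, the grid step, and the helper names `convolve` and `displacement_pdf` are assumptions made only for this example; for large $i$ or fine grids, the Fourier-transform route mentioned above would likely be the faster choice.

```python
def convolve(a, b, dx):
    """Grid approximation of the continuous convolution (a * b), step dx."""
    out = [0.0] * (len(a) + len(b) - 1)
    for ia, va in enumerate(a):
        for ib, vb in enumerate(b):
            out[ia + ib] += va * vb * dx
    return out


def displacement_pdf(p_slot, i, dx):
    """i-fold self-convolution of the one-slot displacement pdf, as in (1)."""
    result = p_slot
    for _ in range(i - 1):
        result = convolve(result, p_slot, dx)
    return result


# Hypothetical example: speed uniform on [-10, 10] m/s, slot length 1 s.
tau, v_max, dx = 1.0, 10.0, 0.1
half = int(round(v_max * tau / dx))
n = 2 * half + 1                       # grid points covering [-v_max*tau, v_max*tau]
p_slot = [1.0 / (n * dx)] * n          # one-slot displacement pdf, sums to 1 / dx

i = 4
p_i = displacement_pdf(p_slot, i, dx)

centre = (len(p_i) - 1) // 2           # grid index of zero displacement
xs = [(k - centre) * dx for k in range(len(p_i))]
total = sum(p_i) * dx                  # should stay close to 1
variance = sum(x * x * p for x, p in zip(xs, p_i)) * dx
```

Because the slots are independent, the variance of the displacement after $i$ slots should be about $i$ times the one-slot variance, which gives a simple sanity check on the result.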

B.  Modeling  the  Movement  of  the  Head  and  the  Tail 

Denote by $H_m$ ($P_m$) the $m$th vehicle to the left of the head
$H_0$ (to the right of the tail $P_0$) at time 0, as shown in Fig. 2.
Define $w_m$ to be the Euclidean distance between $H_m$ and $H_0$ at
time 0.

Let us first consider the movement of the head. Define $x_m(t)$
to be the displacement of $H_m$ at time $t$. Define $y(t)$ to be the
displacement of the head at time $t$. Note that the head vehicle
at time 0 is not necessarily the head vehicle at time $t$, because
the original head may be overtaken by another informed vehicle
during time $(0, t]$. It follows that

$$y(t) = \max\{x_0(t),\; x_1(t) - w_1,\; x_2(t) - w_2,\; \ldots,\; x_n(t) - w_n\} \tag{2}$$

where $n$ is the number of vehicles to the left of the head that
have the potential to overtake the head vehicle at time 0.

Because the movement of a vehicle is independent of other
vehicles, $x_m(t)$ and $x_j(t)$ are independent for any $m \neq j$.
Therefore, the cumulative distribution function (cdf) of the
displacement of the head at time $t$ is

$$\Pr(y(t) \le y) = \prod_{m=0}^{n} \Pr(x_m(t) - w_m \le y) \tag{3}$$

where each factor follows from $p(x, t)$, given by (1), and the pdf
$f_{w_m}(w_m)$ of $w_m$; $w_m$ is the distance between $H_m$
and $H_0$ at time 0, and $w_0 = 0$.


Fig. 3. Displacements of the head and the tail at time $t$ in a catch-up process
with gap $l_c$. The reduction of distance $z$ is the displacement of the head
minus the displacement of the tail.


As an easy consequence of the Poisson distribution of the
vehicles (proved in Lemma 1), the intervehicle distance follows
an exponential distribution. Therefore, we have

$$f_{w_m}(w_m) = \rho e^{-\rho w_m} \frac{(\rho w_m)^{m-1}}{(m-1)!} \quad \text{for } m \ge 1. \tag{4}$$


Define $p_h(y, t)$ to be the probability that the displacement of
the head is $y$ at time $t$. Then

$$p_h(y, t) = \frac{d \Pr(y(t) \le y)}{dy}. \tag{5}$$


The  calculation  is,  however,  tedious  for  a  generic  speed 
distribution.  Therefore,  only  the  methodology  for  the  analysis 
of  the  catch-up  process  is  shown  in  this  section.  A  detailed 
analytical  result  for  the  catch-up  process  under  the  Gaussian 
speed  distribution  is  shown  in  Section  V. 

Denote  by  pg(y,t)  the  probability  that  the  displacement  of 
the  tail  is  y  at  time  t.  The  analysis  for  the  movement  of  the  tail 
is  similar  to  that  for  the  head  and  is,  therefore,  omitted. 


C.  Catch-Up  Delay 

As shown in Fig. 2, consider a catch-up process where the
Euclidean distance between the head and the tail is $l_c$ at the
beginning of the catch-up process (time 0), which is referred to as
a catch-up process with gap $l_c$. Define the catch-up delay $t_c$ to
be the time taken from time 0 until the time when the head and
the tail move into the radio range of each other for the first time,
i.e., $t_2 - t_1$ in Fig. 1. We do not consider the rare event that the
distance between the head and the tail becomes larger than $r_0$
again during $(t_c, t_c + \beta)$, which may interrupt the transmission of
a packet, because the per-hop delay $\beta$ (e.g.,
4 ms [6]) is usually much smaller than the time interval for a
vehicle to change speed (typically longer than a second [16]).
It is worth noting that there is a first passage phenomenon in
the catch-up process, i.e., the catch-up process finishes as soon
as the head and the tail move into the radio range of each
other. Therefore, the catch-up delay is $t_c$ if and only if (iff) the
distance between the head and the tail reduces from $l_c$ at time
0 to $r_0$ for the first time at time $t_c$. This first passage phenomenon
is essential for the analysis of the catch-up process.

Denote by $p_H(z, t)$ the probability that the reduction of the
Euclidean distance between the head and the tail is $z$ at time $t$,
with regard to their original distance at time 0. As illustrated in
Fig. 3, it can be shown that

$$p_H(z, t) = \int_{-\infty}^{\infty} p_h(y, t)\, p_g(y - z, t)\, dy. \tag{6}$$

Note that the aforementioned equation can be converted to a
convolution if $p_g(y, t) = p_g(-y, t)$, which is the case
in the next section, where the vehicle speed follows a
Gaussian distribution.

Fig. 4. Class of random walks taking $i'$ time slots to walk from 0 to $z'$ through
an intermediate point $z$.

Denote by $h(z, i)$ the probability that the reduction in the
distance between the head and the tail reaches $z$ in the $i$th time
slot $[(i-1)\tau, i\tau)$. Therefore

$$h(z, i) = \int_{(i-1)\tau}^{i\tau} p_H(z, t)\, dt. \tag{7}$$

To obtain a closed-form result, the probability of the reduction
in the distance between the head and the tail being $z$ at time
$t \in [(i-1)\tau, i\tau)$ is considered to be approximately equal to the
probability of the reduction of the distance between the head
and the tail being $z$ at time $t = i\tau$. This approximation provides
a fairly accurate result when $\tau$ is small (e.g., $\tau = 1$ s, 5 s), as
shown in Sections VIII-A and C. Therefore, from (7)

$$h(z, i) = \int_{(i-1)\tau}^{i\tau} p_H(z, t)\, dt \approx \tau\, p_H(z, i\tau). \tag{8}$$

Define $\xi(z, i)$ to be the first passage probability [19] of
$h(z, i)$, viz., the probability that the reduction in the distance
between the head and the tail reaches $z$ in the $i$th time slot
$[(i-1)\tau, i\tau)$ for the first time since time 0. The relationship
between $\xi(z, i)$ and $h(z, i)$ can be studied as the first passage
time in a stochastic process [20]. The first passage time of a
diffusing particle or a random walker is the time at which the
particle or the random walker first reaches a specified site.

A standard procedure is applied to determine the first passage
probability $\xi(z, i)$ [19], [20]. As shown in Fig. 4, consider a
class of random walks that start at time 0 and, to walk from point
0 to $z'$, must pass through a point $z$. The transition
from 0 to $z'$ can be decomposed into two independent stages:
in the first stage, an agent walks from 0 to $z$ for the first time
in the $i$th time slot; in the second stage, the agent walks from $z$
to $z'$ in the $(i' - i)$th time slot, not necessarily for the first time.
Then, we can obtain the following [19], [20]:

$$h(z', i') = \sum_{i=1}^{i'} \xi(z, i)\, h(z' - z, i' - i). \tag{9}$$

The convolution can be simplified by the Z-transform with
regard to $i$, which is denoted by the operator $Z$. According to
the convolution theorem, (9) becomes

$$(Zh)(z', s) = (Z\xi)(z, s)\,(Zh)(z' - z, s) \tag{10}$$

and thus

$$(Z\xi)(z, s) = \frac{(Zh)(z', s)}{(Zh)(z' - z, s)}. \tag{11}$$

Then, by inverse Z-transform, we can obtain $\xi(z, i)$. Denote
by $F_\xi(z, i)$ the cdf of $\xi(z, i)$ with regard to $i$, i.e., the probability
that the reduction in the distance between the head and the tail
has reached $z$ during time $(0, i\tau]$. It follows that $F_\xi(l_c - r_0, i)$
is the probability that the head and the tail have moved into the
radio range of each other during time $(0, i\tau]$. Therefore, the expected
catch-up delay ($t_c$) for a catch-up process with gap $l_c$ is

$$E[t_c \mid l_c] = \tau \sum_{i=1}^{\infty} \left(1 - F_\xi(l_c - r_0, i)\right). \tag{12}$$

D. Distribution of the Gaps $l_c$

Denote by $f_l(l)$ the pdf of the Euclidean distance between
any two adjacent vehicles. Due to the Poisson distribution of
vehicles, it is evident that $f_l(l) = \rho e^{-\rho l}$. Denote by $f_{l_c}(l_c)$ the
pdf of the Euclidean distance between any two adjacent but
disconnected vehicles. It is straightforward that, for $l_c > r_0$

$$f_{l_c}(l_c) = \frac{f_l(l_c)}{1 - \int_0^{r_0} f_l(l)\, dl} = \rho e^{-\rho(l_c - r_0)}. \tag{13}$$
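
The gap pdf in (13) is a shifted exponential, so its mean should be $r_0 + 1/\rho$. The following minimal sketch checks this numerically with a midpoint rule; the function names, the truncation point, and the number of sub-intervals are assumptions chosen for illustration, while $\rho = 0.012$ veh/m and $r_0 = 250$ m are the values used later in Section VIII.

```python
import math


def gap_pdf(l_c, rho, r0):
    """Pdf (13) of the gap between adjacent but disconnected vehicles."""
    if l_c <= r0:
        return 0.0
    return rho * math.exp(-rho * (l_c - r0))


def integrate(f, a, b, n=100000):
    """Midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


rho, r0 = 0.012, 250.0
upper = r0 + 40.0 / rho      # assumption: the tail beyond this point is negligible

total = integrate(lambda l: gap_pdf(l, rho, r0), r0, upper)
mean_gap = integrate(lambda l: l * gap_pdf(l, rho, r0), r0, upper)
```

With these values the mean gap comes out close to $250 + 1/0.012 \approx 333$ m, i.e., a disconnected gap is on average about one mean intervehicle distance longer than the radio range.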


V.  Catch-Up  Process  for  a  Gaussian 
Speed  Distribution 

In this section, we provide detailed analytical results on the
catch-up process under the Gaussian speed distribution, which
is a commonly used assumption for VANETs on a highway
[3], [9], [15]. The procedure of the analysis is the same as that
introduced in the previous section, except for some adjustments
to obtain a simpler result.

For a zero-mean Gaussian speed distribution with standard
deviation $\sigma$, the pdf of the vehicle speed is

$$f_v(v) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{v^2}{2\sigma^2}\right). \tag{14}$$

At the end of the first time slot, i.e., $t = \tau$, it is straightforward
to show that

$$p(x, \tau) = \frac{1}{\sigma\tau\sqrt{2\pi}} \exp\left(-\frac{x^2}{2(\sigma\tau)^2}\right). \tag{15}$$

Furthermore, because the convolution of two Gaussian functions
is another Gaussian function [21], using (1), we can obtain

$$p(x, i\tau) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma_i^2}\right) \tag{16}$$

where $\sigma_i^2 = i(\sigma\tau)^2$.

Next, we consider the situation in which vehicles are allowed
to change speed at different time instants. Without loss of
generality, consider that a vehicle changes its speed at time $\tau_0$
for the first time since time 0, where $\tau_0$ is uniformly distributed
in $(0, \tau]$. Therefore, for $t = i\tau$, (1) becomes

$$p(x, t) = p(x, i\tau) = p(x, \tau_0) * \underbrace{(p * p * \cdots * p)}_{(i-1)\text{-fold convolution}}(x, \tau) * p(x, \tau - \tau_0) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma_i^2}\right) \tag{17}$$

where $\sigma_i^2 = (i-1)\sigma^2\tau^2 + \sigma^2\tau_0^2 + \sigma^2(\tau - \tau_0)^2 = i\sigma^2\tau^2 +
2\sigma^2\tau_0^2 - 2\sigma^2\tau\tau_0$.

Compared with the result using a synchronized mobility
model [i.e., (16)], the additional terms are $2\sigma^2\tau_0^2 - 2\sigma^2\tau\tau_0$. To
simplify the analysis, we ignore the additional terms and consider
the synchronized mobility model only. The average error caused
by ignoring these additional terms and assuming a synchronized
mobility model is given by
$\int_0^{\tau} (2\sigma^2\tau_0^2 - 2\sigma^2\tau\tau_0)(1/\tau)\,d\tau_0 = -\sigma^2\tau^2/3$.
The error is small compared
with the dominant term $i\sigma^2\tau^2$, particularly when $i$ is large.
Furthermore, the accuracy of this approximation is verified in
Section VIII-C.


A.  Catch-Up  Delay  in  a  Basic  Catch-Up  Process 

In this section, we temporarily ignore the possibility of
overtaking, i.e., we consider a basic catch-up process involving
only the vehicle that is the head at time 0 catching up with
the vehicle that is the tail at time 0. We have the following
lemma for the basic catch-up process:

Lemma 2: In a basic catch-up process where vehicle speed
follows a zero-mean Gaussian distribution with standard deviation
$\sigma$, the probability that the reduction in the distance between
the head and the tail is $z$ for the first time during the $i$th time
slot is

$$\xi(z, i) = \frac{z}{2i\tau\sigma\sqrt{\pi i}} \exp\left(-\frac{z^2}{4i\tau^2\sigma^2}\right). \tag{18}$$

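Proof: Because the Gaussian speed distribution is symmetric
with respect to the mean, for the displacement of the tail,
we have $p_g(y, t) = p(y, t) = p(-y, t)$. Therefore, using (6), it
can be shown that

$$p_H(z, i\tau) = \int_{-\infty}^{\infty} p(y, i\tau)\, p_g(z - y, i\tau)\, dy = (p * p)(z, i\tau) = \frac{1}{2\sigma\tau\sqrt{\pi i}} \exp\left(-\frac{z^2}{4i(\sigma\tau)^2}\right) \tag{19}$$

which is a Gaussian pdf with variance $2i(\sigma\tau)^2$. Therefore

$$h(z, i) = \tau\, p_H(z, i\tau) = \frac{1}{2\sigma\sqrt{\pi i}} \exp\left(-\frac{z^2}{4i(\sigma\tau)^2}\right). \tag{20}$$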

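Inspired by [20, 6.4], $h(z, i)$ in (20) can be rewritten as the
following to calculate the Z-transform:

$$h(z, i) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} \exp\left(-jz\alpha - i(\sigma\tau)^2\alpha^2\right) d\alpha \tag{21}$$

where $j$ denotes $\sqrt{-1}$.

Then, perform the Z-transform on (21) with regard to $i$, i.e.,

$$(Zh)(z, s) = \sum_{i=1}^{\infty} e^{-si} h(z, i) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} \exp(-jz\alpha) \sum_{i=1}^{\infty} \exp(-si) \exp\left(-i(\sigma\tau)^2\alpha^2\right) d\alpha. \tag{22}$$

It follows that

$$(Zh)(z, s) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} \exp(-jz\alpha)\left(s + \sigma^2\tau^2\alpha^2\right)^{-1} d\alpha = \frac{1}{2\sigma\sqrt{s}} \exp\left(-z\sqrt{s/(\sigma^2\tau^2)}\right). \tag{23}$$
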

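Then, according to (11), we have

$$(Z\xi)(z, s) = \frac{(Zh)(z', s)}{(Zh)(z' - z, s)} = \frac{\exp\left(-z'\sqrt{s/(\sigma^2\tau^2)}\right)}{\exp\left(-(z' - z)\sqrt{s/(\sigma^2\tau^2)}\right)} \tag{24}$$

$$\phantom{(Z\xi)(z, s)} = \exp\left(-z\sqrt{s/(\sigma^2\tau^2)}\right). \tag{25}$$
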

Finally, using the inverse Z-transform, it can be obtained that

$$\xi(z, i) = \frac{z}{2i\tau\sigma\sqrt{\pi i}} \exp\left(-\frac{z^2}{4i\tau^2\sigma^2}\right). \tag{26}$$



We say that one vehicle catches up with another vehicle iff
the Euclidean distance between them reduces to the radio range
$r_0$. Then, using Lemma 2, one can readily obtain the following
result:

Theorem 1: Consider two vehicles separated by Euclidean
distance $l_c$ at time 0. The probability that one vehicle catches
up with the other for the first time in the $i$th time slot is
$\frac{l_c - r_0}{2i\tau\sigma\sqrt{\pi i}} \exp\left(-\frac{(l_c - r_0)^2}{4i\tau^2\sigma^2}\right)$.

Proof: It is straightforward that the required reduction in distance
is $z = l_c - r_0$. Then, using Lemma 2, the theorem is readily
proved. ■
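
The first passage probability of Theorem 1 is simple to evaluate numerically. The sketch below is only an illustration: the function names are hypothetical, the parameter values are those quoted in Section VIII ($\sigma = 7.5$ m/s, $\tau = 5$ s, $r_0 = 250$ m, $l_c = 400$ m), and the `erfc` expression is the continuous-time counterpart of the discrete sum, used here purely as a sanity check rather than as a result of the paper.

```python
import math


def first_passage_prob(z, i, sigma, tau):
    """xi(z, i): probability that the distance reduction first reaches z in slot i."""
    scale = 2.0 * i * tau * sigma * math.sqrt(math.pi * i)
    return z / scale * math.exp(-z * z / (4.0 * i * tau ** 2 * sigma ** 2))


def catch_up_cdf(l_c, r0, n_slots, sigma, tau):
    """Probability that a basic catch-up process finishes within n_slots slots."""
    z = l_c - r0
    return sum(first_passage_prob(z, i, sigma, tau) for i in range(1, n_slots + 1))


sigma, tau, r0, l_c = 7.5, 5.0, 250.0, 400.0
cdf_100 = catch_up_cdf(l_c, r0, 100, sigma, tau)

# Continuous-time reference for the same first passage problem (sanity check only).
reference = math.erfc((l_c - r0) / (2.0 * sigma * tau * math.sqrt(100)))
```

The discrete sum and the continuous reference should agree to within a small discretization error, which grows if $(l_c - r_0)/(\sigma\tau)$ becomes small.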


B.  Catch-Up  Delay  With  Overtaking  Permitted 

In the previous section, the possibility of overtaking is not
included in the calculation of the first passage probability, to
obtain a closed-form result in (26). In this section, the possibility
of overtaking is considered to provide a more accurate result
on the catch-up delay. Recall that $H_m$ is the $m$th vehicle to the
left of the head $H_0$ at time 0 and that $P_{m'}$ is the $m'$th vehicle
to the right of the tail $P_0$ at time 0. Note that all the vehicles
$H_m$ ($P_{m'}$) for $m, m' \in [1, \infty)$ can possibly overtake the head
$H_0$ (tail $P_0$).

Lemma 3: Denote by $q_{mm'}(i \mid l_c)$ the probability that $H_m$
catches up with $P_{m'}$ ($m, m' \in [0, \infty)$) for the first time in the
$i$th time slot, in a catch-up process with gap $l_c$. Then, using
Lemma 2, we have

$$q_{mm'}(i \mid l_c) \approx \xi(l_c - r_0 + m/\rho + m'/\rho,\; i). \tag{27}$$

Proof: Recall that $w_m$ is the distance between $H_m$ and
$H_0$ at time 0, which is the sum of $m$ exponentially distributed
intervehicle distances. Therefore, the expected distance between
$H_m$ and $H_0$ at time 0 is
$\int_0^{\infty} w_m f_{w_m}(w_m)\,dw_m = m/\rho$, where $f_{w_m}(w_m)$ is given by
(4). Similarly, the expected distance between $P_{m'}$ and $P_0$ at
time 0 is $m'/\rho$. Then, in a catch-up process with gap $l_c$,
the expected distance between $H_m$ and $P_{m'}$ at time 0 is
$l_c + m/\rho + m'/\rho$. In order for $H_m$ to catch up with $P_{m'}$, the
reduction in distance should be $z = l_c - r_0 + m/\rho + m'/\rho$. ■
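
The mean distance $m/\rho$ used in this proof can be checked numerically against the pdf in (4). The sketch below is a minimal illustration with hypothetical function names; the truncation point of the integral is an assumption, chosen so that the neglected tail is negligible for the values of $m$ tried here.

```python
import math


def erlang_pdf(w, m, rho):
    """Pdf (4) of the distance w between H_m and H_0 at time 0, for m >= 1."""
    return rho * math.exp(-rho * w) * (rho * w) ** (m - 1) / math.factorial(m - 1)


def mean_distance(m, rho, n=100000):
    """Midpoint-rule estimate of the integral of w * f(w) over [0, upper]."""
    upper = (m + 60.0) / rho          # assumption: tail mass beyond this is negligible
    h = upper / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h
        total += w * erlang_pdf(w, m, rho)
    return h * total


rho = 0.012                            # veh/m, as in Section VIII
means = {m: mean_distance(m, rho) for m in (1, 2, 4)}
```

For $\rho = 0.012$ veh/m, the estimates should be close to 83.3 m, 166.7 m, and 333.3 m for $m = 1, 2, 4$, respectively.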

Remark 1: In Lemma 3, only the mean value of the distance
between vehicles is required. This provides us with the flexibility
to extend the analysis from the Poisson distribution model
to another vehicle distribution model, i.e., we only need to
replace $m/\rho$ in (27) by the corresponding average intervehicle
distance if a different vehicle distribution model is used. The
rest of the analysis on the catch-up process does not depend on
the particular vehicle distribution model being used. However,
the accuracy of the mean value approximation for another
vehicle distribution needs to be validated.

Denote by $H(i \mid l_c)$ the probability that none of the $H_m$-$P_{m'}$
pairs ($m, m' \in [0, \infty)$) catches up in the $i$th time slot, in
a catch-up process with gap $l_c$. Due to the independence of the
movements of vehicles, we have

$$H(i \mid l_c) = \prod_{m, m' \in [0, \infty)} \left(1 - q_{mm'}(i \mid l_c)\right) \tag{28}$$

where $q_{mm'}(i \mid l_c)$ is given by Lemma 3. During numerical
evaluation, finite values of $m, m'$ can provide fairly accurate
results, which are discussed later in Section VIII-A.

Denote by $h(i \mid l_c)$ the probability that at least one pair of
$H_m$-$P_{m'}$ catches up in the $i$th time slot and none of them has
caught up before the $i$th time slot, in a catch-up process with
gap $l_c$. It is straightforward that

$$h(i_c \mid l_c) = \left(1 - H(i_c \mid l_c)\right) \prod_{i=1}^{i_c - 1} H(i \mid l_c). \tag{29}$$

Finally, the expected delay for a catch-up process with gap
$l_c$ is

$$E[t_c \mid l_c] = \sum_{i=1}^{\infty} i\tau\, h(i \mid l_c). \tag{30}$$

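
The chain from the first passage probability of Lemma 2 through (27)-(29) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, $m$ and $m'$ are truncated to a finite `m_max` as the text suggests for numerical evaluation, and the parameter values are those of Section VIII. The sketch stops at the per-slot probabilities; how many slots should be retained when evaluating the sum in (30) is not stated here, so that step is left out.

```python
import math


def xi(z, i, sigma, tau):
    """First passage probability of Lemma 2."""
    scale = 2.0 * i * tau * sigma * math.sqrt(math.pi * i)
    return z / scale * math.exp(-z * z / (4.0 * i * tau ** 2 * sigma ** 2))


def no_catch_up(i, l_c, r0, rho, sigma, tau, m_max):
    """H(i | l_c) from (28), with m and m' truncated to 0..m_max."""
    prob = 1.0
    for m in range(m_max + 1):
        for mp in range(m_max + 1):
            z = l_c - r0 + (m + mp) / rho      # required reduction, from (27)
            prob *= 1.0 - xi(z, i, sigma, tau)
    return prob


def catch_up_pmf(n_slots, l_c, r0, rho, sigma, tau, m_max):
    """h(i | l_c) from (29) for i = 1..n_slots."""
    pmf, survive = [], 1.0
    for i in range(1, n_slots + 1):
        h_none = no_catch_up(i, l_c, r0, rho, sigma, tau, m_max)
        pmf.append((1.0 - h_none) * survive)
        survive *= h_none                       # no pair caught up in slots 1..i
    return pmf


params = dict(l_c=400.0, r0=250.0, rho=0.012, sigma=7.5, tau=5.0)
pmf_basic = catch_up_pmf(40, m_max=0, **params)      # head and tail only
pmf_overtake = catch_up_pmf(40, m_max=4, **params)   # m = m' = 4, as in Section VIII-A
```

Summing either list up to slot $i$ gives the probability that the catch-up process has finished within $i$ slots; allowing overtaking can only increase that probability, since more vehicle pairs are given the chance to close the gap.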


VI.  Analysis  on  the  Forwarding  Process 

In a forwarding process, the packet is forwarded in a multi-hop
manner between vehicles inside a cluster. We start with the
analysis on the distribution of the length of the cluster.

A. Cluster Length

Define the cluster length $x_0$ to be the diameter of a cluster,
which is the Euclidean distance between the vehicles at the
two ends of a cluster. Define $f_{x_0}(x_0)$ to be the pdf of the
cluster length, which can be studied as the pdf of the busy
period in queuing theory. In previous research, only numerical
solutions [3] or approximate results [5] were provided. In this
section, we provide a closed-form formula for the pdf of the
cluster length using a different method inspired by the study on
the connectivity of random interval graphs [22] and the theory of
coverage processes [23].

Theorem 2: In a VANET where the spatial distribution of the
vehicles follows a homogeneous Poisson process with intensity
$\rho$, the pdf of the cluster length is

$$f_{x_0}(x_0) = \frac{\rho}{e^{\rho r_0} - 1} \sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \frac{\left(-\rho(x_0 - m r_0)\right)^{m-1}}{-m!} \left(\rho(x_0 - m r_0) + m\right) e^{-\rho m r_0} \tag{31}$$

where $m$ is an integer, and $\lfloor \cdot \rfloor$ is the floor function.

The  proof  is  shown  in  the  Appendix. 

Theorem  2  shows  a  closed-form  formula  for  the  pdf  of  the 
cluster  length,  which  is  essential  for  analytical  study  on  the 
performance  of  VANETs. 
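
A direct evaluation of the cluster length pdf of Theorem 2 can be sketched as below. The function name is hypothetical, and the $m = 0$ term is written out as 1 (its simplified value) to avoid a division by zero at $x_0 = 0$; this simplification is an implementation choice of the sketch. Because the sum alternates in sign, floating-point cancellation may become an issue for very long clusters, which the sketch does not attempt to handle.

```python
import math


def cluster_length_pdf(x0, rho, r0):
    """Cluster length pdf for Poisson density rho and radio range r0 (x0 >= 0)."""
    total = 0.0
    for m in range(int(math.floor(x0 / r0)) + 1):
        u = rho * (x0 - m * r0)
        if m == 0:
            term = 1.0     # (-u)^(-1) / (-0!) * (u + 0) simplifies to 1
        else:
            term = ((-u) ** (m - 1) / (-math.factorial(m))
                    * (u + m) * math.exp(-rho * m * r0))
        total += term
    return rho / (math.exp(rho * r0) - 1.0) * total


rho, r0 = 0.012, 250.0                     # values used in Section VIII
flat_level = rho / (math.exp(rho * r0) - 1.0)
pdf_100 = cluster_length_pdf(100.0, rho, r0)
pdf_200 = cluster_length_pdf(200.0, rho, r0)
pdf_300 = cluster_length_pdf(300.0, rho, r0)
```

For $x_0 < r_0$ only the $m = 0$ term contributes, so the pdf is flat at $\rho/(e^{\rho r_0} - 1)$; for $r_0 \le x_0 < 2r_0$ the $m = 1$ term lowers it by the probability that an empty stretch longer than $r_0$ breaks the cluster.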


B.  Hop  Count  Statistics  in  a  Cluster 

To study the IPS in the forwarding process, the number of
hops between the leftmost vehicle and the rightmost vehicle
in the cluster needs to be calculated. Two vehicles are said to
be $k$ hops apart if the shortest path between them,
measured by the number of hops, is $k$. Define $\phi_k(x_0)$ to be the
probability that two vehicles separated by Euclidean distance
$x_0$ are $k$ hops apart. It is assumed that the positions of the
vehicles do not change during the forwarding process since the
forwarding delay is relatively small, which is also confirmed in
Fig. 7. Therefore, the probability $\phi_k(x_0)$ can be calculated by
the result introduced in [24] for a static 1-D multi-hop network.

Define $\Pr_s(x_0)$ to be the probability of successful transmission
between any pair of vehicles separated by Euclidean
distance $x_0$. An end-to-end packet transmission is successful
if a packet can reach the destination by any number of hops.
Therefore, $\Pr_s(x_0) = \sum_{k=1}^{\infty} \phi_k(x_0)$.

Define $\phi_{k_s}(x_0)$ to be the conditional probability that a
packet reaches its destination at the $k_s$th hop, conditioned on
the transmission being successful and the Euclidean distance
between source and destination being $x_0$. It is trivial to see that
$\phi_k(x_0) = \phi_{k_s}(x_0) \Pr_s(x_0)$. Therefore, the expected number of
hops between two vehicles separated by distance $x_0$, given that
they are connected, is

$$E_s[k_s \mid x_0] = \frac{\sum_{k=1}^{\infty} k\, \phi_k(x_0)}{\Pr_s(x_0)}. \tag{32}$$


Define the forwarding delay to be the time required for
a packet to be forwarded from the leftmost vehicle to the
rightmost vehicle in the cluster, which is $t_1 - t_0$ in the example
shown in Fig. 1. Then, the expected forwarding delay in a
cluster with length $x_0$ is $E[t_f \mid x_0] = \beta E_s[k_s \mid x_0]$.

Remark 2: The hop count statistics for a 1-D network with
inhomogeneous Poisson distribution of nodes are studied in [25],
which provides us with the required methodology for extending
our analysis on the forwarding process from the Poisson distribution
model to another vehicle distribution model.

VII. Information Propagation Speed

The entire information propagation process can be considered
as a renewal reward process [26], where each cycle consists
of a catch-up process, followed by a forwarding process,
and the reward is the information propagation distance during
each cycle. As mentioned in Section III, $E[v]$ is the constant
component of the vehicle speed. It can be shown that [4], [5]

$$E[v_{ip}] = \frac{\text{expected length of one cycle}}{\text{expected time duration of one cycle}} + E[v]
= \frac{\int_{r_0}^{\infty} l_c f_{l_c}(l_c)\,dl_c + \int_0^{\infty} x_0 f_{x_0}(x_0)\,dx_0}{\int_{r_0}^{\infty} E[t_c \mid l_c] f_{l_c}(l_c)\,dl_c + \beta + \frac{1}{1 - p_c} \int_0^{\infty} E[t_f \mid x_0] f_{x_0}(x_0)\,dx_0} + E[v] \tag{33}$$

where $p_c = 2W_{\min}N_b / \left((W_{\min} + 1)^2 + 2W_{\min}N_b\right)$ is the probability
of collision given in [27], $W_{\min}$ is the minimum contention
window size, and $N_b = 2\rho r_0$ is the average node degree.
Packet collision can be shown to have a negative impact on the
forwarding process, i.e., reducing the IPS, when the vehicle
density is high. To illustrate this effect, we conducted more
simulations using the parameters shown in [28], i.e., $W_{\min} = 32$.
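
To give a feel for how strongly collisions may matter at high density, the sketch below evaluates the collision probability quoted from [27], read here as $p_c = 2W_{\min}N_b / ((W_{\min} + 1)^2 + 2W_{\min}N_b)$ with average node degree $N_b = 2\rho r_0$. The function name is hypothetical and this reading of the expression is an assumption of the sketch; $W_{\min} = 32$ and $r_0 = 250$ m are the values stated in the text.

```python
def collision_probability(rho, r0, w_min):
    """Collision probability p_c as a function of vehicle density rho (veh/m)."""
    n_b = 2.0 * rho * r0                    # average node degree on a line
    return 2.0 * w_min * n_b / ((w_min + 1) ** 2 + 2.0 * w_min * n_b)


r0, w_min = 250.0, 32
p_low = collision_probability(0.0, r0, w_min)      # no neighbors, no collisions
p_mid = collision_probability(0.012, r0, w_min)    # density used for the catch-up study
p_high = collision_probability(0.06, r0, w_min)    # average node degree of 30
```

Under this reading, $p_c$ increases monotonically with the vehicle density, which is in line with the statement that collisions reduce the IPS mainly when the density is high.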


VIII.  Simulation  Results 

In this section, we report on simulations to validate the
accuracy of the analytical results for the catch-up process. The
simulations are conducted in a VANET simulator written in
C++. Each point shown in the figures is the average value
from 2000 simulations. The confidence interval is too small to
be distinguishable and, hence, is omitted in the following plots.
The radio range is $r_0 = 250$ m [5]. The mobility parameters
are $E[v] = 25$ m/s and $\sigma = 7.5$ m/s [15]. To distinguish the
impact on the IPS of packet collision from that of other parameters, we
let $p_c = 0$, except in Fig. 11.


A.  Catch-Up  Process 

As mentioned earlier, we use $\tau = 1$ s, 5 s in this paper. Only
the results for $\tau = 5$ s are shown in this section since the results
for $\tau = 1$ s have a similar (and slightly better) accuracy. The
traffic arrival rate is $\lambda = 0.3$ veh/s. It follows that the spatial
distribution of the vehicles follows a homogeneous Poisson process
with intensity $\rho = 0.012$ veh/m, obtained from the expression in
Lemma 1, which is
a low traffic density, resulting in a large number of catch-up
processes. The results for other densities are quite similar and,
hence, are not shown in this section.

Fig. 5(a) shows the probability that a catch-up process with
gap $l_c = 400$ m finishes within time $t$. It can be seen that, when
$m = m' = 4$, the analytical result gives a good approximation.
Moreover, considering more vehicles in the overtake process,
e.g., $m = 6$ or $m = 8$, has marginal impact on the results.
This is because, as the distance between vehicles increases, the
probability of overtaking rapidly decreases, which can be seen
in Fig. 5(b). Fig. 5(b) shows the simulation result of the probability
that a randomly chosen vehicle overtakes another vehicle
within time $t = i\tau$, where their distance is $z$ at time 0. Because
the average intervehicle distance is $1/\rho = 1/0.012 \approx 83$ m, the
curve of $z = m \times 83$ in Fig. 5(b) approximately illustrates the
probability that vehicle $H_m$ overtakes $H_0$ before time $t$. As
can be seen in the figure, the probability that $H_4$ overtakes $H_0$
within 100 s while none of $H_1$, $H_2$, and $H_3$ overtakes $H_0$ within
100 s is approximately given by $0.2(1 - 0.72)(1 - 0.52)(1 -
0.34) = 0.01774$. It can be understood that the probability that
$H_0$ is overtaken by another vehicle farther away than $H_4$ is very
small. Therefore, considering $m = m' = 4$ can provide a good
approximation.

Fig. 5. (a) CDF of the catch-up delay for a catch-up process with gap $l_c =
400$. (b) Simulation results on the probability that a randomly chosen vehicle
overtakes another vehicle within time $t$, where their initial Euclidean distance
is $z = m \times 83$ at time 0. Settings: $\lambda = 0.3$ veh/s, $r_0 = 250$ m, $l_c = 400$ m,
Gaussian speed with $\sigma = 7.5$ m/s, $\tau = 5$ s.

Fig. 6. (a) Expected catch-up delay. (b) PDF of the length of the gap ($l_c$).

Fig. 6(a) shows the expected catch-up delay for a catch-up
process with gap $l_c$. It can be seen that the analytical result,
which considers $m = m' = 4$, provides a good approximation.
The discrepancy between the simulation result and the analytical
result is caused by the approximations used during the
analysis. Specifically, the first passage analysis is only applied
to the analysis of the catch-up process between a pair of vehicles,
which are the head and the tail at the start of the catch-up
process. However, the first passage analysis does not consider
the possibility that the head (the tail) may be overtaken by other
vehicles during the catch-up process. Furthermore, Fig. 6(b)
verifies that the intervehicle distance, under our network
model and the Gaussian speed distribution, still follows an
exponential distribution with $\rho = 0.012$. This property is also
expected to hold for some other speed distributions, which is an
issue that is left as future work.


B.  Forwarding  Process 

In addition to the simulation settings introduced earlier, the
per-hop delay is $\beta = 4$ ms [5]. Fig. 7(a) shows the expected
forwarding delay in a cluster with a given length. Fig. 7(b)
shows the pdf of the cluster length. It can be seen that the analytical
results match the simulation results very well. The results
for other values of the parameters have a similar accuracy and
are thus omitted. Furthermore, it is interesting to note that, in
Fig. 7(b), the pdf of the cluster length is a constant for $x_0 \in
[0, 250]$. This is because, within the radio range ($r_0 = 250$) of
the first vehicle, the probability of being connected is 1 [see (36)],
and the cluster length is $x_0$
if and only if there is a vehicle in $[x_0, x_0 + dx_0)$, and there is
no vehicle in $[x_0 + dx_0, x_0 + r_0)$. It follows that the pdf of the
cluster length is a constant for $x_0 \in [0, r_0]$.

Fig. 7. Expected forwarding delay and the pdf of the cluster length.
(a) Expected forwarding delay. (b) PDF of the cluster length.

Fig. 8. Expected IPS for $\tau = 1$ s, 5 s.

C.  IPS 

In addition to the simulation settings introduced earlier, the Poisson arrival rate $\lambda$ is varied from 0 to 1.5 veh/s. With $E[v] = 25$, the spatial distribution of the vehicles follows a homogeneous Poisson process with intensity $\rho$ ranging from 0 to 0.06. For completeness of the plot, $\rho = 0$ is included, which means that there is only one vehicle on the road. Therefore, the average number of neighbors (average node degree) varies from 0 to 30, which represents a large range of traffic densities.
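The density and node-degree ranges quoted above can be reproduced with a short calculation. The sketch below assumes the relation $\rho = \lambda / E[v]$ (which is what the quoted numbers imply, since 1.5/25 = 0.06), a speed unit of m/s, and the radio range $r_0 = 250$ m from the figure settings; the function names are illustrative only.

```python
def spatial_density(arrival_rate, mean_speed):
    """Spatial intensity rho (veh/m), assuming rho = lambda / E[v]."""
    return arrival_rate / mean_speed


def average_node_degree(density, radio_range):
    """Expected number of neighbors of a vehicle on a line:
    vehicles within radio_range on either side, i.e. 2 * rho * r0."""
    return 2.0 * density * radio_range


# Largest arrival rate used in the simulations: 1.5 veh/s at E[v] = 25 m/s.
rho_max = spatial_density(1.5, 25.0)
degree_max = average_node_degree(rho_max, 250.0)
print(rho_max, degree_max)
```

With these inputs the upper ends of the two ranges come out at about 0.06 veh/m and about 30 neighbors, consistent with the text.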

Fig. 8 shows the expected IPS for $\tau = 1$ s, 5 s. It can be seen that, when the vehicle density is low, the IPS is determined by vehicle speeds because there is little packet forwarding in the network. When the vehicle density increases, small clusters are formed, and the IPS is determined by the catch-up delay, which is further determined by the mobility of the vehicles. It can be seen that the more frequently the speed changes, the slower the information propagates. This is mainly because changing speed has the potential to interrupt the catch-up process, i.e., during a catch-up process, the tail may speed up and the head may slow down. An intuitive explanation can be provided by considering an extreme case. When the speed-change time interval tends to 0, it can be shown using the central limit theorem that the average vehicular speed in any specified time interval converges to the mean speed $E[v]$. Hence, the network topology becomes static, and the expected IPS equals the mean speed $E[v]$ because there is no catch-up process to bridge the gaps. Finally, as the vehicle density further increases, clusters become larger, and the forwarding process starts to dominate. Therefore, the IPS increases until it reaches the maximum value, which is determined by the per-hop delay in the forwarding process. The maximum IPS is equal to $r_0/\beta$, where $\beta$ is the per-hop delay.

Fig. 9. Expected IPS under the constant speed model, together with the curve for $\tau = 1$ for comparison. The result Analytical-wu is calculated based on [5]. (Plot settings: $r_0 = 250$ m, $\beta = 4$ ms, Gaussian speed with $\sigma = 7.5$ m/s, constant speed model.)
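As a quick numerical illustration of this upper limit, using the simulation settings stated in this section ($r_0 = 250$ m and $\beta = 4$ ms):

```latex
\[
  \frac{r_0}{\beta} \;=\; \frac{250\ \text{m}}{0.004\ \text{s}} \;=\; 62\,500\ \text{m/s}.
\]
```

This is the speed reached when every hop covers the full radio range, so it is a ceiling rather than a typical value.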

Fig. 9 shows the expected IPS under the constant speed model, i.e., the vehicle speed does not change over time. The result Analytical-wu is calculated based on [5] for comparison. The constant speed model is a special case of the mobility model used in this paper, i.e., when $\tau \to \infty$. In Fig. 9, we choose a fairly large value of $\tau$ to obtain the analytical result under our mobility model and use it as an approximation of the result under the constant speed model. The result Analytical-wu, which does not consider the first passage phenomenon, provides a good match with the simulations. This is because the first passage phenomenon does not have a significant impact when the vehicle speed does not change over time. Finally, it can be seen that the constant speed model used in previous research (e.g., [5] and [8]) causes serious overestimation of the IPS by almost an order of magnitude. Therefore, time variation of vehicle speed is an important factor affecting the IPS.

Fig. 10 shows the expected IPS under nonsynchronized mobility models. In the analysis, we choose the synchronized random walk mobility model to study the impact of the time variation of vehicular speed on IPS. To verify the general applicability of the analytical study based on the simplified mobility model, more simulations are conducted. As shown in Fig. 10, three different mobility models are evaluated: the speed-change time interval $\tau$ of each vehicle is uniformly selected from [4.8, 5.2] (i.e., about 5), uniformly selected from [2, 8] (i.e., within a larger range around 5), or drawn from an exponential distribution with mean 5. Under all three mobility models, vehicles change speed at different time instants (nonsynchronized), and the average speed-change time interval is 5 s. It can be seen that the IPSs under the nonsynchronized mobility models are very close to (almost indistinguishable from) each other, and our analysis using the simplified (synchronized) mobility model provides a good estimate of the IPS.

Fig. 10. Expected IPS under nonsynchronized mobility models. (Plot settings: $r_0 = 250$ m, delay = 4 ms, Gaussian speed with $\sigma = 7.5$ m/s, $\tau$ a random variable.)

Fig. 11. Expected IPS in a VANET subject to log-normal shadowing and packet collision, where $\eta$ is the path-loss exponent, and the legend also gives the standard deviation of the log-normal shadowing. Furthermore, when a head cannot transmit a packet to any uninformed vehicle to the right of itself, the head keeps trying to retransmit the packet after every 0.9-s time delay. The value is chosen according to the real-world measurement that the channel coherence time of a VANET on the freeway is about 0.3-1.5 s [29]. (Plot settings unless specified: unit disk, $r_0 = 250$ m, delay = 4 ms.)
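The three nonsynchronized interval models can be sampled as follows. This is a minimal sketch, not the simulator used for Fig. 10: the model labels, the function names, and the use of seconds as the unit are assumptions made for illustration. It only checks that each model has a mean speed-change interval close to 5 s.

```python
import random


def sample_interval(model, rng):
    """Draw one speed-change interval (assumed to be in seconds)."""
    if model == "uniform_narrow":      # uniform on [4.8, 5.2]
        return rng.uniform(4.8, 5.2)
    if model == "uniform_wide":        # uniform on [2, 8]
        return rng.uniform(2.0, 8.0)
    if model == "exponential":         # exponential with mean 5
        return rng.expovariate(1.0 / 5.0)
    raise ValueError(f"unknown model: {model}")


def mean_interval(model, samples=50_000, seed=0):
    """Empirical mean interval of a model over `samples` draws."""
    rng = random.Random(seed)
    return sum(sample_interval(model, rng) for _ in range(samples)) / samples


for name in ("uniform_narrow", "uniform_wide", "exponential"):
    print(name, round(mean_interval(name), 2))
```

The three models share the same mean but differ strongly in spread, which is what makes the near-identical IPS curves in Fig. 10 informative.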

We consider the unit disk model in the analysis. The unit disk communication model is constructed based on the path-loss attenuation model, which is suitable for modeling the radio environment in free space without clutter [30]. Therefore, the unit disk model is suitable for the VANET on the freeway. To study the impact of clutter such as road-side buildings, simulation results of the expected IPS under the log-normal shadowing model [30] are shown in Fig. 11. The results are compared under the condition that the average node degrees (i.e., the average number of neighbors per node) under the log-normal model and under the unit disk model are the same. In the log-normal shadowing model, the received signal strength (RSS) attenuation (in decibels) follows a normal distribution with a mean value equal to the RSS under the path-loss attenuation model. This random variation of the RSS attenuation provides a higher chance for a node to find a next-hop neighbor. Hence, even with the same average node degree, the IPS under the log-normal shadowing model is faster than that under the unit disk model. A similar observation is obtained in the study of network connectivity in [7] and [30, Th. 2.5.2]. Therefore, the IPS under the unit disk model can be considered as a lower bound on the IPS of a VANET in the real world.
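To make the contrast between the two channel models concrete, the sketch below compares link probabilities as a function of distance. It uses the common textbook form of the log-normal shadowing link probability, which is an assumption here rather than the exact simulation setup: `r0` is taken as the distance at which the mean RSS equals the reception threshold, `eta` is the path-loss exponent, and `sigma_db` is the shadowing standard deviation in dB. It does not perform the equal-average-degree normalization described above.

```python
import math


def link_probability_unit_disk(d, r0):
    """Unit disk model: a link exists iff the distance is at most r0."""
    return 1.0 if d <= r0 else 0.0


def link_probability_lognormal(d, r0, eta, sigma_db):
    """Log-normal shadowing: RSS (dB) is normal around the path-loss value.

    A link exists when the shadowing term exceeds the extra path loss
    10 * eta * log10(d / r0), giving a Gaussian tail probability.
    """
    if d <= 0.0:
        return 1.0
    margin_db = 10.0 * eta * math.log10(d / r0)
    return 0.5 * math.erfc(margin_db / (sigma_db * math.sqrt(2.0)))


# Beyond r0 the unit disk model gives no link, but shadowing still allows one.
p_disk = link_probability_unit_disk(500.0, 250.0)
p_shadow = link_probability_lognormal(500.0, 250.0, eta=2.0, sigma_db=4.0)
print(p_disk, p_shadow)
```

The nonzero probability of long links beyond $r_0$ is the mechanism behind the faster IPS observed under shadowing.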

In addition, the third and fourth curves in Fig. 11 show the IPS subject to packet collision with collision probability $p_c$ given in Section VII. It can be seen that packet collision has a significant impact on the IPS, particularly when the vehicle density is high.

IX.  Conclusion  and  Future  Work 

In this paper, analytical models have been created for the information propagation process in a mobile ad hoc network formed by vehicles moving on a highway. Analytical results have been provided for the expected delay in the catch-up process, the expected delay in the forwarding process, and the distribution of the cluster length. Based on these results, the IPS has been derived. It has been shown that various parameters, such as vehicle density and time variation of vehicle speed, can have a significant impact on the IPS. By taking real-world measurements such as $\lambda$, $E[v]$, $\sigma$, and $\tau$, our results can provide a quick estimate of the IPS with good accuracy. The research provides useful guidelines on the design of mobile VANETs.

The  analysis  in  this  paper  is  conducted  in  a  1-D  network. 
A  straightforward  extension  to  2-D  networks  is  to  consider  a 
Manhattan  model  [32],  i.e.,  grid  topology.  In  grid  topology, 
each  street  (edge)  can  be  treated  as  a  1-D  roadway,  which  can 
be  studied  using  the  results  obtained  in  this  paper.  Together 
with  existing  results  on  how  a  message  passes  through  a  road 
intersection  area  [33],  our  results  can  be  adapted  for  2-D  grid 
networks. Furthermore, for unconstrained mobility in 2-D or 3-D mobile ad hoc networks, the methodology developed in this paper for the analysis of the catch-up process and forwarding process is also applicable.

Appendix 

Proof  of  Theorem  2 

Let the origin of the axis be the position of the leftmost vehicle of a cluster. Let $N$ be the random variable representing the number of vehicles in the cluster. The cluster length lies in $[x_0, x_0 + dx_0)$, and there are $n$ vehicles in this cluster if and only if the conditions given here hold.

$\mathcal{E}_1$: There is a vehicle in $[x_0, x_0 + dx_0)$.

$\mathcal{E}_2$: There is no vehicle in $[x_0 + dx_0, x_0 + r_0)$.

$\mathcal{E}_3$: There are $n - 2$ vehicles in $(0, x_0)$.

$\mathcal{E}_4$: The intervehicle distance between any two adjacent vehicles for those $n$ vehicles in $[0, x_0 + dx_0)$ is smaller than or equal to $r_0$.

Denote by $\Pr(\mathcal{E}_m)$ the probability of event $\mathcal{E}_m$ in the preceding list. Due to the Poisson distribution of vehicles, it is straightforward to show that

$$\Pr(\mathcal{E}_1) = \rho\,dx_0, \qquad \Pr(\mathcal{E}_2) = e^{-\rho r_0} \tag{34}$$

$$\Pr(\mathcal{E}_3) = \frac{(\rho x_0)^{n-2} e^{-\rho x_0}}{(n-2)!} \tag{35}$$


Furthermore, $\Pr(\mathcal{E}_4)$ can be studied using [22, Lemma 1], which provides a result on the connectivity of random interval graphs. In [22], vertices are uniformly distributed on a unit interval. Due to the Poisson distribution of the vehicles in our case, given that there are $n$ vehicles in a cluster with length $x_0$, these vehicles also follow a uniform distribution. Therefore, by scaling the cluster length $x_0$ to 1 and, consequently, the radio range to $r_0/x_0$, we have the following equation from Lemma 1 in [22]:

$$\Pr(\mathcal{E}_4) = \sum_{m=0}^{\min\{n-1,\ \lfloor x_0/r_0 \rfloor\}} \binom{n-1}{m} (-1)^m \left(1 - m\frac{r_0}{x_0}\right)^{n-2} \tag{36}$$

where $m$ is an integer, and $\lfloor\cdot\rfloor$ is the floor function. For the convenience of the following calculation, let $\binom{n-1}{m} = 0$ for $m > n - 1$. Thus, the preceding summation is from $m = 0$ to $\lfloor x_0/r_0 \rfloor$.

Events $\mathcal{E}_1$, $\mathcal{E}_2$, and $\mathcal{E}_3$ are independent of each other. Event $\mathcal{E}_4$, which is conditioned on event $\mathcal{E}_3$, is independent of events $\mathcal{E}_1$ and $\mathcal{E}_2$. Define $f(x_0, N = n)$ to be the probability that the cluster length lies in $[x_0, x_0 + dx_0)$ and there are $n$ vehicles in this cluster. It is evident that $f(x_0, N = n) = \Pr(\mathcal{E}_1)\Pr(\mathcal{E}_2)\Pr(\mathcal{E}_3)\Pr(\mathcal{E}_4)$. Next, we derive $f_{x_0}(x_0)$ using $f(x_0, N = n)$.

If the cluster only consists of one vehicle, then the cluster length is 0, and the probability of this event is $e^{-\rho r_0}$. If the cluster consists of more than one vehicle, then the pdf of cluster length $x_0$ is


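$$f_{x_0}(x_0) = \frac{\sum_{n=2}^{\infty} f(x_0, N = n)}{\Pr(N \ge 2)} = \sum_{n=2}^{\infty} \frac{\rho e^{-\rho r_0} (\rho x_0)^{n-2} e^{-\rho x_0}}{(n-2)!\,(1 - e^{-\rho r_0})} \times \sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \binom{n-1}{m} (-1)^m \left(1 - m\frac{r_0}{x_0}\right)^{n-2} \tag{37}$$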


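Using $\binom{n}{m} = n!/(m!(n-m)!)$, the summation terms in (37) can be simplified as

$$\sum_{n=2}^{\infty} \frac{(\rho x_0)^{n-2}}{(n-2)!} \sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \binom{n-1}{m} (-1)^m \left(1 - m\frac{r_0}{x_0}\right)^{n-2} = \sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \frac{(-1)^m}{m!} \sum_{n=\max\{2,\,m+1\}}^{\infty} \frac{(n-1)\,(\rho(x_0 - m r_0))^{n-2}}{(n-1-m)!} \tag{39}$$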


Using $x e^x = \sum_{a=0}^{\infty} (a x^a / a!)$, (39) can be written as

$$\sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \frac{(-1)^m}{m!} \left(\rho(x_0 - m r_0)\right)^{m-1} \left(\rho(x_0 - m r_0) + m\right) e^{\rho(x_0 - m r_0)} \tag{40}$$


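Substituting the preceding equation into (37), one can obtain

$$f_{x_0}(x_0) = \frac{\rho e^{-\rho r_0} e^{-\rho x_0}}{1 - e^{-\rho r_0}} \sum_{m=0}^{\lfloor x_0/r_0 \rfloor} \frac{(-1)^m}{m!} \left(\rho(x_0 - m r_0)\right)^{m-1} \left(\rho(x_0 - m r_0) + m\right) e^{\rho(x_0 - m r_0)}$$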
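The closed-form pdf of the cluster length can be evaluated numerically as a sanity check. The sketch below is an illustration only: the function name and the parameter values ($\rho = 0.004$, $r_0 = 250$) are assumptions chosen to keep the alternating sum well conditioned, and the factor $e^{-\rho x_0} e^{\rho(x_0 - m r_0)}$ is combined into $e^{-\rho m r_0}$ for numerical stability. It checks that the pdf is constant on $[0, r_0)$ and that it integrates to approximately 1.

```python
import math


def cluster_length_pdf(x0, rho, r0):
    """pdf of the cluster length x0, given at least two vehicles in the cluster."""
    if x0 < 0.0:
        return 0.0
    prefactor = rho * math.exp(-rho * r0) / (1.0 - math.exp(-rho * r0))
    total = 0.0
    for m in range(int(math.floor(x0 / r0)) + 1):
        y = rho * (x0 - m * r0)
        if m == 0:
            # y^(-1) * (y + 0) * e^y * e^(-rho * x0) reduces to 1 for m = 0.
            term = 1.0
        else:
            term = y ** (m - 1) * (y + m) * math.exp(-rho * m * r0) / math.factorial(m)
        total += (-1) ** m * term
    return prefactor * total


rho, r0 = 0.004, 250.0
flat_value = rho * math.exp(-rho * r0) / (1.0 - math.exp(-rho * r0))

# Midpoint-rule integral over [0, 5000] with unit step; the jump at x0 = r0
# falls on a cell boundary, so it does not disturb the quadrature.
integral = sum(cluster_length_pdf(k + 0.5, rho, r0) for k in range(5000))
print(cluster_length_pdf(100.0, rho, r0), flat_value, integral)
```

The drop of the pdf just above $x_0 = r_0$ reflects that a cluster longer than the radio range needs at least one intermediate vehicle.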
Abstract: This paper studies the information propagation
process in a 1-D mobile ad-hoc network formed by vehicles
traveling  on  a  highway.  We  consider  that  vehicles  can  be  divided 
into  traffic  streams;  vehicles  in  the  same  traffic  stream  have  the 
same  speed  distribution,  while  the  speed  distributions  of  vehicles 
in  different  traffic  streams  are  different.  Analytical  formulas 
are  derived  for  the  fundamental  properties  of  the  information 
propagation  process  as  well  as  the  information  propagation  speed. 
Using  the  formulas,  one  can  straightforwardly  study  the  impact 
on  the  information  propagation  speed  of  various  parameters 
such  as  radio  range,  vehicular  traffic  density,  vehicular  speed 
distribution  and  the  time  variation  of  vehicular  speed. 


I. Introduction

This paper studies the information propagation speed (IPS), which is the expected propagation speed for a piece of information to be broadcast along the road in a vehicular ad-hoc network (VANET). The IPS is an important performance metric for many VANET applications, especially for the safety messaging applications [1].

It has been shown [2], [3] that a VANET is usually partitioned into a number of clusters, where a cluster is a maximal set of vehicles in which every pair of vehicles is connected by at least one multi-hop path. Due to the mobility of vehicles, the clusters are splitting and merging over time. Therefore, information propagation in a VANET is typically based on a store-and-forward scheme. Considering the example illustrated in Fig. 1, a piece of information starts to propagate from the origin toward the positive direction of the axis at time $t_0$. The vehicles that have received this piece of information are referred to as the informed vehicles, whereas other vehicles are uninformed. The multi-hop forwarding of the message within a cluster, which begins at $t_0$ and ends at $t_1$, is called a forwarding process. In a forwarding process the information propagation speed is determined by the per-hop delay and the length of the cluster. The per-hop delay $\beta$ is the time required for a vehicle to receive and process a message before it is available for further retransmission [4]. The value of $\beta$ reflecting the typical technology is 4 ms [4]. We show that the per-hop delay has a significant impact on the IPS.

Fig. 1. Illustration of the topology of a VANET at different time instants. The positive direction of the axis is the direction of information propagation.

Define the head at time $t$ as the informed vehicle with the largest coordinate at time $t$. Define the tail at time $t$ as the uninformed vehicle with the smallest coordinate at time $t$. Two vehicles can directly communicate with each other if and only if their Euclidean distance is not larger than the radio range $r_0$, i.e. we adopt the unit disk model. As shown in Fig. 1, at time $t_1$ the tail is outside the radio range of the head. Then a catch-up process begins, during which the informed vehicles hold the information until the head catches up with the tail (at time $t_2$).

Recent  research  has  shown  that  downstream  traffic  (a  set 
of  vehicles  traveling  in  the  opposite  direction  of  information 
propagation)  can  be  exploited  to  improve  the  IPS  [2],  [5],  [6]. 
Further,  real  world  measurements  show  that  vehicles  traveling 
in  different  lanes  (e.g.  bus  lane  or  heavy  truck  lane)  have 
different  speed  distributions  [7].  In  view  of  these  observations, 
we  consider  multiple  traffic  streams,  where  a  traffic  stream  is  a 
set  of  vehicles  following  the  same  speed  distribution.  A  traffic 
stream  does  not  have  to  consist  of  vehicles  traveling  in  the 
same  lane.  Traffic  streams  also  represent  different  types  of 
vehicles  (e.g.  sports  cars  or  heavy  trucks).  Interesting  results 
are  obtained  by  allowing  vehicles  in  different  traffic  streams 
to  have  different  speed  distributions. 

The main contributions of this paper are as follows. Firstly, an analytical model for the information propagation process considering multiple traffic streams is provided. It is shown that a small difference in the average vehicular speeds between traffic streams can result in a significant increase of the IPS. Secondly, time variation of vehicular speed is considered in the analysis, which results in interesting conclusions, e.g. the IPSs are similar for information propagating in both positive and negative directions of a road, which is different from previous studies (e.g. [5], [6]) considering time-invariant vehicular speed only. Based on the analysis of the catch-up process and forwarding process, analytical results for the IPS are derived, which are validated using simulations. This paper provides useful guidelines on the design of a multi-lane VANET.

The  rest  of  this  paper  is  organized  as  follows:  Section  II 
reviews  related  work.  Section  III  introduces  the  mobility  model 
and  network  model.  The  analysis  on  the  catch-up  process  is 
given  in  Section  IV.  The  forwarding  process  is  studied  in 
Section  V,  followed  by  the  results  of  the  IPS.  Section  VI 
validates  the  analysis  using  simulations.  Section  VII  concludes 
this  paper  and  discusses  future  work. 

II.  Related  work 

In [2], Agarwal et al. studied the IPS in a 1-D VANET where
vehicles are Poisson distributed and move at the same speed,
which  is  time-invariant,  but  either  in  the  positive  (upstream) 
or  negative  (downstream)  direction  of  the  axis.  The  upstream 
and  downstream  traffic  have  the  same  density.  This  work  was 
later  extended  in  [5]  to  allow  the  upstream  and  the  downstream 
traffic  to  have  different  densities.  The  authors  derived  upper 
and  lower  bounds  for  the  IPS,  which  provided  a  hint  on 
the  impact  of  vehicle  density  on  the  IPS.  Recent  research 
of  Baccelli  et  al.  [6]  provided  analytical  results  on  the  IPS 
under  the  same  setting  as  that  in  [5].  Note  that  the  analyses  in 
[2],  [5],  [6]  were  all  based  on  the  simplifying  assumption  that 
vehicles  in  each  traffic  stream  travel  at  the  same  speed,  which 
is time-invariant. Evidently, in such a model, catch-up can only
occur via the vehicles travelling in the opposite direction.

In [3], Wu et al. considered a 1-D VANET where vehicles are Poisson distributed and the vehicle speeds are uniformly distributed within a designated range. Considering one traffic stream and time-invariant vehicular speed, they provided analytical results on the IPS when the vehicle density is either very low or very high. They also provided a numerical method to calculate the IPS in a VANET with two traffic streams. In an earlier work [8], we showed that the time variation of vehicular speed has a significant impact on the IPS. However, only one traffic stream was considered in [8], i.e. all vehicles travel in the same direction and follow the same speed distribution. This paper considers multiple traffic streams and evaluates the impact of vehicular densities and speed distributions on information propagation in multi-lane VANETs.

III.  System  Model 

Suppose that there are a total of $N$ traffic streams, wherein vehicles are free to change lanes and overtake other vehicles. We adopt the commonly-used traffic model in traffic theory [9], viz. the number of vehicles in the $n$th traffic stream passing an observation point during any time interval follows a homogeneous Poisson process with intensity $\lambda_n$ veh/s.

A synchronized random walk mobility model is considered. Specifically, time is divided into time slots with equal length $\tau$. Then, the $i$th time slot is $t \in ((i-1)\tau, i\tau]$. Each vehicle in the $n$th ($n \in [1, N]$) traffic stream chooses its speed randomly at the beginning of each time slot, independent of the speeds of other vehicles and its own speed in other time slots, according to a probability density function (pdf) $f_{v_n}(v)$. In this paper, we consider the Gaussian speed distribution, i.e. $f_{v_n}(v) \sim \mathcal{N}(\mu_n, \sigma_n^2)$, where $\mu_n$ (resp. $\sigma_n^2$) is the mean speed (resp. variance) in the $n$th traffic stream. The Gaussian speed distribution is commonly used for modeling the VANETs on a freeway [3], [9], [10].

Under  the  aforementioned  setting,  it  can  be  shown  that  at 
any  time  instant  the  spatial  distribution  of  the  vehicles  in  the 
nth  traffic  stream  follows  a  homogeneous  Poisson  process 
with  intensity  pn  =  Xn  J”  An(f)  [g].  Then  according 
to  the  superposition  property  of  the  Poisson  process,  at  any 
time  instant  the  spatial  distribution  of  all  the  vehicles  on  the 
road  follows  a  homogeneous  Poisson  process  with  intensity 
IV. Catch-up process

Without loss of generality, it is assumed that the catch-up process starts at time 0. The displacement $x$ ($x \in (-\infty, \infty)$) of a vehicle at time $t$ is defined to be the distance between the position of the vehicle at time 0 and its position at time $t$.

This  section  is  organized  as  follows:  in  Lemma  1,  we  study 
the  movement  of  a  single  vehicle.  The  result  on  the  distance 
between  two  vehicles  at  a  given  time  is  summarized  in  Lemma 
2.  After  considering  the  overtaking  of  vehicles  in  Section  IV-C, 
we  obtain  the  delay  for  a  catch-up  process  in  Section  IV-D. 


A.  Modeling  the  movement  of  a  single  vehicle 

Denote by $p_n(x,t)$ the probability that the displacement of a vehicle in the $n$th traffic stream is $x$ at time $t$. We have:

Lemma 1: Under the system model introduced in Section III, when $t = i\tau$,

$$p_n(x,t) = p_n(x, i\tau) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(\frac{-(x - \mu_n i\tau)^2}{2\sigma_i^2}\right) \tag{1}$$

where $\sigma_i^2 = i(\sigma_n\tau)^2$.

Proof: According to the Gaussian speed distribution model, the pdf of the speed of a vehicle in the $n$th traffic stream is:

$$f_{v_n}(v) = \frac{1}{\sigma_n\sqrt{2\pi}} \exp\left(\frac{-(v - \mu_n)^2}{2\sigma_n^2}\right) \tag{2}$$

Because the speed does not change during a time slot, it is straightforward that $p_n(x,\tau)$ is also a Gaussian function:

$$p_n(x,\tau) = \frac{1}{\sigma_n\tau\sqrt{2\pi}} \exp\left(\frac{-(x - \mu_n\tau)^2}{2(\sigma_n\tau)^2}\right) \tag{3}$$

Due to the independence of the vehicular speeds in different time slots (hence the displacements), we have for $t = i\tau$:

$$p_n(x,t) = p_n(x, i\tau) = (\underbrace{p_n * p_n * \cdots * p_n}_{i\text{-fold convolution}})(x,\tau) \tag{4}$$

Then one can obtain Eq. 1 using the property that the convolution of two Gaussian functions is a Gaussian function.
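Lemma 1 can be checked empirically by simulating the slot-by-slot speed choices of a single vehicle. The following is a minimal Monte Carlo sketch; the parameter values (mean speed 25, standard deviation 7.5, slot length 5, four slots) and the function names are assumptions for illustration. The sample mean and variance of the displacement should be close to $\mu_n i\tau$ and $i(\sigma_n\tau)^2$ respectively.

```python
import random
import statistics


def simulate_displacement(mu, sigma, tau, slots, rng):
    """Displacement after `slots` time slots: a fresh Gaussian speed is
    drawn at the start of each slot and held for the slot length tau."""
    return sum(rng.gauss(mu, sigma) * tau for _ in range(slots))


def displacement_moments(mu, sigma, tau, slots, runs=20_000, seed=1):
    """Sample mean and variance of the displacement over many vehicles."""
    rng = random.Random(seed)
    samples = [simulate_displacement(mu, sigma, tau, slots, rng) for _ in range(runs)]
    return statistics.fmean(samples), statistics.pvariance(samples)


mean, var = displacement_moments(mu=25.0, sigma=7.5, tau=5.0, slots=4)
print(mean, var)  # Lemma 1 predicts 500 and 4 * (7.5 * 5)**2 = 5625
```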
B. The distance between a pair of vehicles

As shown in Fig. 2, denote by $H_m$ (resp. $P_m$) the $m$th vehicle to the left of the head $H_0$ (resp. to the right of the tail $P_0$) at time 0. If $H_m$ happens to be in the $n$th traffic stream, which happens with probability $\rho_n/\rho$, then an additional label $n$ (e.g. $H_m^n$) is used to indicate that the vehicle $H_m$ is in the $n$th traffic stream. In this subsection, we study the distance between $H_m^n$ and $P_{m'}^{n'}$, where $n, n' \in [1, N]$ and $m, m'$ are non-negative integers.
Fig. 2. Illustration of a VANET at the beginning of a catch-up process.

The catch-up process between $H_m^n$ and $P_{m'}^{n'}$ finishes as soon as the distance between them reduces to radio range $r_0$ for the first time. Note that the reduction of distance between $H_m^n$ and $P_{m'}^{n'}$ can reach a given value $z$ at several time instants with non-zero probabilities. We are interested in the probability that the reduction of the distance between $H_m^n$ and $P_{m'}^{n'}$ reaches $z$ for the first time, i.e. the first passage time phenomenon, which is essential for the analysis of the catch-up process.

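Lemma 2: Denote by $G^{nn'}_{mm'}(z,i)$ the probability that the reduction of the distance between $H_m^n$ and $P_{m'}^{n'}$, with regard to their distance at time 0, reaches $z$ ($z \neq 0$) for the first time in the $i$th ($i \ge 1$) time slot. Then:

(a) When $\mu_n = \mu_{n'}$:

$$G^{nn'}_{mm'}(z,i) = \frac{z}{i\sqrt{2\pi i\tau^2(\sigma_n^2 + \sigma_{n'}^2)}} \exp\left(\frac{-z^2}{2i\tau^2(\sigma_n^2 + \sigma_{n'}^2)}\right) \tag{5}$$

(b) When $\mu_n \neq \mu_{n'}$ and $i = 1$:

$$G^{nn'}_{mm'}(z,i) \le \frac{1}{2}\left(1 - \operatorname{erf}\left(\frac{z - \mu_i}{\sqrt{2\sigma_i^2}}\right)\right) \tag{6}$$

(c) When $\mu_n \neq \mu_{n'}$ and $i \ge 2$:

$$G^{nn'}_{mm'}(z,i) \le \frac{1}{4}\left(1 + \operatorname{erf}\left(\frac{z - \mu_{i-1}}{\sqrt{2\sigma_{i-1}^2}}\right)\right)\left(1 - \operatorname{erf}\left(\frac{z - \mu_i}{\sqrt{2\sigma_i^2}}\right)\right) \tag{7}$$

where $\mu_i = \mu_n i\tau - \mu_{n'} i\tau$, $\sigma_i^2 = (\sigma_n^2 + \sigma_{n'}^2)\tau^2 i$.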

Proof: Denote by $g^{nn'}_{mm'}(z,t)$ the probability that the reduction of the distance between $H_m^n$ and $P_{m'}^{n'}$ is $z$ at time $t$. When $t = i\tau$,

$$g^{nn'}_{mm'}(z,t) = \int_{-\infty}^{\infty} p_n(x,t)\,p_{n'}(x - z, t)\,dx \tag{8}$$

$$= \int_{-\infty}^{\infty} p_n(x,t)\,p_{n'}(z - x, t)\,dx \tag{9}$$

$$= (p_n * p_{n'})(z,t) \tag{10}$$

$$= \frac{1}{\sigma_t\sqrt{2\pi}} \exp\left(\frac{-(z - \mu_t)^2}{2\sigma_t^2}\right) \tag{11}$$

where $\mu_t = (\mu_n - \mu_{n'})t$, $\sigma_t^2 = (\sigma_n^2 + \sigma_{n'}^2)\tau t$, and Eq. 11 is obtained from Eq. 10 using Lemma 1. Note that Eq. 9 is obtained from Eq. 8 by letting the mean speed of $P_{m'}^{n'}$ be 0 instead of $\mu_{n'}$, and consequently the mean speed of $H_m^n$ becomes $\mu_n - \mu_{n'}$. Therefore, $p_{n'}(x,t)$ becomes a zero-mean Gaussian function with respect to $x$; hence $p_{n'}(x - z, t) = p_{n'}(z - x, t)$.

When two traffic streams have the same mean speed, i.e. $\mu_n = \mu_{n'}$, we have a special case that $\mu_t = 0$ in Eq. 11. For simplicity, the probability that the reduction of the distance between $H_m^n$ and $P_{m'}^{n'}$ reaches $z$ in the $i$th time slot, i.e. $\int_{(i-1)\tau}^{i\tau} g^{nn'}_{mm'}(z,t)\,dt$, is approximated by $\tau g^{nn'}_{mm'}(z, i\tau)$. That is, we approximately consider that $g^{nn'}_{mm'}(z,t) = g^{nn'}_{mm'}(z, i\tau)$ for $t \in ((i-1)\tau, i\tau]$. Applying a standard procedure [8] to determine the first passage probability $G^{nn'}_{mm'}(z,i)$, we have:


Tgfffifzffr)  =rY,GZU^i)gnS'^  -  (*'  -  (12) 


Solving the equation by the Z-transform as introduced in our previous work [8], one can obtain that for $\mu_n = \mu_{n'}$:

$$G^{nn'}_{mm'}(z,i) = \frac{z}{i\sqrt{2\pi i\tau^2(\sigma_n^2 + \sigma_{n'}^2)}} \exp\left(\frac{-z^2}{2i\tau^2(\sigma_n^2 + \sigma_{n'}^2)}\right) \tag{13}$$

When $\mu_n \neq \mu_{n'}$, we use a method different from the complicated first passage analysis to calculate $G^{nn'}_{mm'}(z,i)$ from $g^{nn'}_{mm'}(z,t)$. Define $A$ to be the event that the reduction of distance between $H_m^n$ and $P_{m'}^{n'}$ is smaller than $z$ at time $(i-1)\tau$. Define $B$ to be the event that the reduction of distance between $H_m^n$ and $P_{m'}^{n'}$ is larger than $z$ at time $i\tau$. Then for $\mu_n \neq \mu_{n'}$ and $i \ge 2$:

$$G^{nn'}_{mm'}(z,i) \le \Pr(A)\Pr(B|A) \le \Pr(A)\Pr(B) \tag{14}$$

$$= \int_{-\infty}^{z} g^{nn'}_{mm'}(z_0, (i-1)\tau)\,dz_0 \int_{z}^{\infty} g^{nn'}_{mm'}(z_0, i\tau)\,dz_0 \tag{15}$$

$$= \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{z - \mu_{i-1}}{\sqrt{2\sigma_{i-1}^2}}\right)\right) \cdot \frac{1}{2}\left(1 - \operatorname{erf}\left(\frac{z - \mu_i}{\sqrt{2\sigma_i^2}}\right)\right) \tag{16}$$

where $\mu_i = \mu_n i\tau - \mu_{n'} i\tau$, $\sigma_i^2 = (\sigma_n^2 + \sigma_{n'}^2)\tau^2 i$. The first inequality is due to the fact that we do not consider the first passage phenomenon. Hence $\Pr(A)\Pr(B|A)$ can be larger than the probability that the reduction of distance between $H_m^n$ and $P_{m'}^{n'}$ reaches $z$ for the first time in the $i$th time slot. This bound is fairly tight in the case $\mu_n \neq \mu_{n'}$ considered here, because when $\mu_n \neq \mu_{n'}$, $g^{nn'}_{mm'}(z,t)$ becomes a Gaussian function with a non-zero mean according to Eq. 11. It follows that the mean reduction of distance between $H_m^n$ and $P_{m'}^{n'}$ becomes a decreasing (when $\mu_n < \mu_{n'}$) or increasing (when $\mu_n > \mu_{n'}$) function of $t$. Therefore the first passage phenomenon is less notable when $\mu_n \neq \mu_{n'}$. Secondly, the bound $\Pr(B|A) \le \Pr(B)$ in Eq. 14 is due to the observation that $A$ and $B$ are negatively correlated, i.e. given that the reduction of distance is less than $z$ at time $(i-1)\tau$, the reduction of distance is less likely to be larger than $z$ at time $i\tau$. Further, this correlation appears to have limited impact on the final result under the settings introduced in Section VI.

For $\mu_n \neq \mu_{n'}$ and $i = 1$, $\Pr(A) = 1$. Therefore:

$$G^{nn'}_{mm'}(z,i) \le \Pr(B) = \frac{1}{2}\left(1 - \operatorname{erf}\left(\frac{z - \mu_i}{\sqrt{2\sigma_i^2}}\right)\right) \tag{17}$$
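The expressions of Lemma 2 are straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions: the function names and argument layout are hypothetical, variances are passed in directly, and the equal-mean case uses the first passage form with prefactor $z/i$. The second function returns the upper bound $\Pr(A)\Pr(B)$ for unequal means, with $\Pr(A) = 1$ when $i = 1$.

```python
import math


def first_passage_equal_means(z, i, tau, var_n, var_np):
    """First passage probability in slot i when both streams share a mean speed."""
    spread = i * tau ** 2 * (var_n + var_np)          # sigma_i^2
    return z / (i * math.sqrt(2.0 * math.pi * spread)) * math.exp(-z * z / (2.0 * spread))


def first_passage_bound(z, i, tau, mu_n, mu_np, var_n, var_np):
    """Upper bound Pr(A) * Pr(B) on the first passage probability (unequal means)."""
    def mean(k):
        return (mu_n - mu_np) * k * tau               # mu_k

    def std(k):
        return math.sqrt((var_n + var_np) * tau ** 2 * k)   # sigma_k

    pr_b = 0.5 * (1.0 - math.erf((z - mean(i)) / (math.sqrt(2.0) * std(i))))
    if i == 1:
        return pr_b                                   # Pr(A) = 1 in the first slot
    pr_a = 0.5 * (1.0 + math.erf((z - mean(i - 1)) / (math.sqrt(2.0) * std(i - 1))))
    return pr_a * pr_b


# Example: streams with mean speeds 30 and 25, standard deviation 7.5, tau = 5.
bound_slot1 = first_passage_bound(25.0, 1, 5.0, 30.0, 25.0, 56.25, 56.25)
bound_slot2 = first_passage_bound(25.0, 2, 5.0, 30.0, 25.0, 56.25, 56.25)
print(bound_slot1, bound_slot2)
```

In the example, $z$ equals the mean reduction after one slot, so $\Pr(B) = 0.5$ in the first slot by symmetry of the Gaussian.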
C. The catch-up process between $H_m$ and $P_{m'}$

A catch-up process, where the distance between the head $H_0$ and tail $P_0$ is $l_c$ at time 0, is referred to as a catch-up process with gap $l_c$. Note that the head at time $t$ is not necessarily the head vehicle at time 0 ($H_0$) because $H_0$ may be overtaken by another informed vehicle during time $(0, t]$. We first consider the catch-up process between a pair of vehicles $H_m$-$P_{m'}$.

Denote by $q_{mm'}(i|l_c)$ the probability that $H_m$ catches up with $P_{m'}$ for the first time in the $i$th time slot, in a catch-up process with gap $l_c$. Recall that $H_m$ (resp. $P_{m'}$) belongs to the $n$th (resp. $n'$th) traffic stream with probability $\rho_n/\rho$ (resp. $\rho_{n'}/\rho$). It can be shown that

$$q_{mm'}(i|l_c) = \sum_{n,n' \in [1,N]} \frac{\rho_n \rho_{n'}}{\rho^2}\, G^{nn'}_{mm'}(z,i) \tag{18}$$

where $z = l_c - r_0 + w_m + w_{m'}$ and $w_m$ (resp. $w_{m'}$) is the expected distance between $H_m$ and $H_0$ (resp. $P_{m'}$ and $P_0$) at time 0 as shown in Fig. 2. Due to the Poisson distribution of vehicles, the inter-vehicle distance follows an exponential distribution with mean $1/\rho$. It follows that $w_m = m/\rho$, $w_{m'} = m'/\rho$. Therefore $z = l_c - r_0 + m/\rho + m'/\rho$.
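The mixing over traffic streams can be sketched in a few lines. This illustration assumes that a vehicle belongs to stream $n$ with probability $\rho_n/\rho$ (the superposition property of Poisson processes) and that the stream memberships of the two vehicles are independent; `first_passage` is a hypothetical callable standing in for the per-stream first passage probability, and the other names are also illustrative.

```python
def catch_up_probability(m, m_prime, i, gap, r0, densities, first_passage):
    """Probability that H_m catches up with P_m' for the first time in slot i.

    densities     -- per-stream spatial intensities rho_n (veh/m)
    first_passage -- callable (n, n_prime, z, i) giving the per-stream
                     first passage probability for distance reduction z
    """
    rho = sum(densities)
    # Distance to be closed: gap minus radio range, plus the expected
    # offsets m/rho and m'/rho of the two vehicles from head and tail.
    z = gap - r0 + m / rho + m_prime / rho
    total = 0.0
    for n, rho_n in enumerate(densities):
        for n_prime, rho_np in enumerate(densities):
            weight = (rho_n / rho) * (rho_np / rho)
            total += weight * first_passage(n, n_prime, z, i)
    return total


# With a constant per-stream probability, the mixture returns that constant.
q_const = catch_up_probability(1, 2, 1, 300.0, 250.0, [0.01, 0.01],
                               lambda n, n_prime, z, i: 0.1)
# Returning z itself exposes the distance used inside the mixture.
z_used = catch_up_probability(1, 2, 1, 300.0, 250.0, [0.01, 0.01],
                              lambda n, n_prime, z, i: z)
print(q_const, z_used)
```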

D.  Delay  of  a  catch-up  process 

Define the catch-up delay $t_c$ to be the time taken from the
beginning of a catch-up process until the time when the head
and tail move into the radio range of each other for the first
time, i.e. $t_2 - t_1$ in Fig. 1. Denote by $H(i \mid l_c)$ the probability
that none of the $H_m$-$P_{m'}$ pairs catches up in the $i$th time slot,
in a catch-up process with gap $l_c$. Due to the independence
between the movements of vehicles, we have:

$$H(i \mid l_c) = \prod_{m, m' \in [0, \infty)} \left(1 - q_{mm'}(i \mid l_c)\right) \qquad (19)$$

where $q_{mm'}(i \mid l_c)$ is given by Eq. 18.

Denote by $h(i \mid l_c)$ the probability that at least one pair of
$H_m$-$P_{m'}$ catches up in the $i$th time slot and none of them has
caught up before the $i$th time slot, in a catch-up process with
gap $l_c$. Assume that the catch-up event in the $i$th time slot is
independent of that in the $j$th time slot for $0 < j < i$, which
is an accurate approximation when the duration of a time slot
is large, e.g. $\tau = 1$ s or 5 s as shown in Section VI. Then:

$$h(i_c \mid l_c) = \left(1 - H(i_c \mid l_c)\right) \prod_{i=1}^{i_c - 1} H(i \mid l_c) \qquad (20)$$

Finally, the expected delay for a catch-up process with gap $l_c$
is $E[t_c \mid l_c] = \sum_{i=1}^{\infty} i \tau \, h(i \mid l_c)$.
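
The chain from Eq. 19 to the expected catch-up delay can be sketched numerically as follows. This is only an illustrative sketch: the function names are hypothetical, the per-pair probabilities $q_{mm'}(i \mid l_c)$ are assumed to be supplied by the caller, and the infinite product and sum are truncated to the finite inputs given.

```python
from math import prod


def prob_no_catch_up(q_values):
    """Eq. 19 on a truncated set of pairs: product of (1 - q_mm'(i | l_c))."""
    return prod(1.0 - q for q in q_values)


def first_catch_up_pmf(H):
    """Eq. 20: h(i_c | l_c) for i_c = 1..len(H), where H[i-1] = H(i | l_c)."""
    pmf = []
    survive = 1.0  # probability that no pair caught up in slots 1..i_c-1
    for H_i in H:
        pmf.append((1.0 - H_i) * survive)
        survive *= H_i
    return pmf


def expected_catch_up_delay(H, tau):
    """Truncated E[t_c | l_c] = sum over i of i * tau * h(i | l_c)."""
    pmf = first_catch_up_pmf(H)
    return sum(i * tau * h_i for i, h_i in enumerate(pmf, start=1))
```

Truncating the sum is reasonable only when the remaining probability mass (the running product of the $H(i \mid l_c)$ values) is negligible at the last slot considered.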

V.  Forwarding  process  and  IPS 

Define forwarding delay as the time required for a packet
to be forwarded from the leftmost vehicle to the rightmost
vehicle in a cluster. Assume that a cluster does not become
disconnected during the forwarding process since the per-hop
delay is relatively small (e.g. 4 ms [4]). Then, the expected
forwarding delay in a cluster with length $x_0$ is $E[t_f \mid x_0] =
\beta E[k \mid x_0]$, where $E[k \mid x_0]$ (given in [8]) is the expected number
of hops between two vehicles separated by $x_0$.


The entire information propagation process can be considered
as a renewal reward process where each cycle consists
of a catch-up process followed by a forwarding process and
the reward is the information propagation distance during each
cycle. Therefore, the expected IPS ($E[v_{ip}]$) is [3]:

$$E[v_{ip}] = \frac{\text{expected length of one cycle}}{\text{expected time duration of one cycle}} + \mu_{\max}
= \frac{\int_{r_0}^{\infty} l_c f_{l_c}(l_c)\,dl_c + \int_0^{\infty} x_0 f_{x_0}(x_0)\,dx_0}{\int_{r_0}^{\infty} E[t_c \mid l_c] f_{l_c}(l_c)\,dl_c + \beta + \int_0^{\infty} E[t_f \mid x_0] f_{x_0}(x_0)\,dx_0} + \mu_{\max} \qquad (21)$$

where $\mu_{\max} = \max_{n \in [1, N]}\{\mu_n\}$ is the maximum average speed
among traffic streams and $f_{l_c}(l_c) = \rho e^{-\rho (l_c - r_0)}$ is the pdf of the
distance between two adjacent but disconnected vehicles [8].
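
Once the four integrals in Eq. 21 have been evaluated, the renewal-reward ratio reduces to simple arithmetic on per-cycle averages. The sketch below assumes that reading of Eq. 21 (cycle length over cycle time, plus $\mu_{\max}$); the function and parameter names are hypothetical, and the averages over $f_{x_0}$ and $E[t_c \mid l_c]$ are assumed to be computed elsewhere.

```python
def mean_gap(rho, r0):
    """Mean of f_lc(l_c) = rho * exp(-rho * (l_c - r0)) on l_c >= r0."""
    return r0 + 1.0 / rho


def expected_ips(mean_gap_len, mean_cluster_len, mean_catch_up_delay,
                 mean_forwarding_delay, beta, mu_max):
    """Renewal-reward ratio of Eq. 21 built from per-cycle averages.

    mean_gap_len          : average of l_c under f_lc
    mean_cluster_len      : average of x_0 under f_x0
    mean_catch_up_delay   : average of E[t_c | l_c] under f_lc
    mean_forwarding_delay : average of E[t_f | x_0] under f_x0
    beta                  : per-hop delay
    mu_max                : maximum average speed among traffic streams
    """
    cycle_length = mean_gap_len + mean_cluster_len
    cycle_time = mean_catch_up_delay + beta + mean_forwarding_delay
    return cycle_length / cycle_time + mu_max
```

The closed form used in `mean_gap` follows from the shifted exponential pdf of the gap: the gap exceeds $r_0$ by an exponential amount with mean $1/\rho$.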

VI.  Simulation  results 

Simulations are conducted in a VANET simulator written
in C++. Each point shown in the figures is the average value
from 2000 simulations. The confidence interval is too small to
be distinguishable and hence is omitted in the following plots.
The radio range is $r_0 = 250$ m [3]. The typical values of the
mean and standard deviation of vehicular speed are 25 m/s and
7.5 m/s [10]. These mobility parameters, i.e. $\mu_n$, $\sigma_n$ and $\tau$, are
taken from practical measurements, where the usual record
time intervals for a vehicle speed monitor are $\tau = 1$ s, 5 s [11].
The traffic density $\rho$ is varied from 0
to 0.06 veh/m. For completeness of the plot, $\rho = 0$ is included,
which means there is only one vehicle in each traffic stream.

Fig. 3 and Fig. 4 show the expected IPS in a VANET.
Firstly, it can be seen that the analytical results match
the simulation results well. Secondly, when the vehicular density
is low, the IPS is determined by vehicular speeds (whose
mean value is 25 m/s) because there is little packet forwarding.
Thirdly, when the vehicular density increases, small clusters
are formed and the IPS increases. As the vehicular density
further increases, clusters become larger and the forwarding
process starts to dominate. Therefore, the IPS increases until
it reaches the maximum value $r_0/\beta$, where $\beta$ is the per-hop
delay. When the vehicular density is moderate, the IPS is
determined by the catch-up delay, which is further determined
by the mobility of vehicles, as shown in our previous work [8].

Moreover, the vehicular speed distribution also has a significant
impact on the catch-up process and hence the IPS. Fig. 3 shows
the IPS in a VANET with two traffic streams with equal
vehicular density but different vehicular speed distributions.
It can be seen that our analytical result matches
the simulation results better than that in [6], which does not
consider the catch-up process studied in Lemma 2 of this
paper. An interesting observation is that the IPS increases
when the average speed of the vehicles in one of the traffic
streams is reduced from 25 m/s to 0 m/s. Further, an even faster
IPS is observed when the average speed of the vehicles in
one of the traffic streams is $-25$ m/s, i.e. the two traffic streams
head in opposite directions. The reason behind this
observation is that a larger relative speed between vehicles
results in faster catch-up processes and hence a faster IPS. This
can also be seen from the analytical results,
e.g. by applying the fact that the error function is an increasing
function to Eq. 6. This observation tells us that making use
of the vehicles in the negative traffic stream can increase
the IPS. Moreover, some (stationary) roadside units without
(expensive) wired connections can also significantly increase
the IPS in a VANET.


Fig. 3. The expected IPS in a VANET with different vehicular speeds in
two traffic streams ($N = 2$). The plot Baccelli shows the IPS derived in [6]
for a VANET with vehicular speeds 25 and $-25$ in the two streams respectively.

Fig. 4 shows the expected IPS in a VANET with different
vehicular densities in two traffic streams. Firstly, it can be
seen that an uneven distribution of vehicular densities between
traffic streams (e.g. $\rho_1 = \rho/10$, $\rho_2 = 9\rho/10$) results in
a slower IPS, compared with the IPS in a VANET with
evenly distributed traffic densities (e.g. $\rho_1 = \rho_2 = \rho/2$).
This is because an uneven distribution of vehicular densities
between traffic streams results in a smaller number of catch-ups
between vehicles in different traffic streams (as manifested
by the factor $\rho_n \rho_{n'}/\rho^2$ in Eq. 18), hence less improvement on IPS
is provided by the large relative vehicular speed, compared
with a VANET with evenly distributed vehicular densities
between traffic streams. This situation can be observed on freeways
connecting the business district and residential district.
Densities of the traffic streams in opposite directions can vary
greatly depending on the time of day, as people go to work
or come back home. On the other hand, it can be seen that
a small amount (e.g. $\rho/10$) of vehicular traffic in the opposite
direction can still significantly increase the IPS. Therefore, it
is meaningful to exploit the traffic streams with different mean
speeds (e.g. heavy trucks or buses).

Moreover, it can be seen in Fig. 4 that the last four curves
are very close to each other, which suggests that given a configuration
of the vehicular densities and speed distributions in a
VANET, the IPSs are similar for information propagating
in the positive and negative directions. This conclusion is
different from previous ones that were based on the time-invariant
speed model [5], [6]. Our results suggest that a
similar IPS is achievable in both directions of a two-way
communication despite the uneven vehicular density between
traffic streams.

Fig. 4. The expected IPS in a VANET with different vehicular densities in
two traffic streams. The first four curves are kept from Fig. 3 for comparison.

VII. Conclusion and future work

In this paper, analytical results are provided for the information
propagation process in a mobile ad-hoc network
formed by vehicles moving on a freeway. It is considered that
vehicles  can  be  divided  into  multiple  traffic  streams.  Vehicles 
in  the  same  traffic  stream  have  the  same  speed  distribution 
while  the  speed  distributions  of  vehicles  in  different  traffic 
streams  are  different.  We  found  that  the  IPS  can  be  boosted 
significantly  by  exploiting  the  existence  of  even  a  small 
number  of  vehicles  traveling  with  a  different  mean  speed, 
such  as  vehicles  traveling  in  the  opposite  direction  or  heavy 
trucks  moving  in  the  same  direction  but  with  a  slower  speed. 
On  the  other  hand,  the  IPSs  are  similar  for  the  information 
propagating  toward  both  positive  and  negative  directions  in  the 
case  of  uneven  traffic  densities  among  various  traffic  streams. 
By using parameters extracted from real world measurements
for $\lambda$, $\mu$, $\sigma$ and $\tau$, our results can provide a quick estimate
of the IPS with good accuracy. The analysis can be further
extended to 2D topologies in the future. Further, we are going
to consider the impact of channel randomness on the IPS.

Abstract: In this paper, we consider 2D wireless multi-hop
networks  with  mobile  nodes  randomly  distributed  on  a  torus, 
and  a  small  number  of  base  stations  (infrastructure  nodes) 
deterministically  placed  in  the  same  area.  Mobile  nodes  move 
following  a  random  walk  mobility  model.  A  piece  of  information 
is  broadcast  from  the  base  stations  at  the  same  time  in  a 
multi-hop  manner  using  a  Susceptible-Infectious-Recovered  (SIR) 
epidemic  routing  algorithm.  A  distinguishing  feature  of  the  SIR 
algorithm,  which  leverages  the  mobility  of  mobile  users,  is  that 
a  relay  node  carries  a  piece  of  information  for  a  pre-determined 
amount  of  time  and  forwards  it  at  any  available  opportunity 
during  that  time.  We  provide  analytical  results  for  the  percolation 
probability  and  for  the  expected  fraction  of  nodes  that  receive  the 
information  when  the  information  dissemination  process  stops. 
Further,  we  study  the  time  delay  of  the  information  dissemination 
process.  The  accuracy  of  the  analytical  results  is  verified  using 
simulations. 

Index Terms: mobile ad-hoc networks, infrastructure, percolation,
random walk, epidemic routing

I.  Introduction 

In  this  paper,  we  consider  an  infrastructure-based  mobile 
ad-hoc  network  (MANET)  with  a  large  number  of  mobile 
nodes  (MNs)  moving  following  a  random  walk  mobility 
model  and  a  small  number  of  base  stations  /  infrastructure 
nodes  (INs).  Examples  of  such  infrastructure-based  MANETs 
include  wildlife  tracking  sensor  networks,  vehicular  ad-hoc 
networks and mobile social networks [1]. Taking the mobile
social  network  as  an  example,  the  information  (news  or 
advertisement)  dissemination  relies  not  only  on  the  mobile 
broadband  connection  between  mobile  users  and  the  base 
stations,  but  also  on  the  local  ad-hoc  connections  between 
mobile  users  that  emerge  as  mobile  users  move  and  meet 
each  other.  This  ad-hoc  method  of  information  dissemination 
can  reduce  the  resource  usage  of  base  stations  and  lower  the 
cost  for  service  providers  and  users  by  utilizing  cheaper  radio 
resources,  such  as  Bluetooth  or  cognitive  radio,  and  physical 
mobility of the mobile users, for information dissemination.
Moreover,  in  the  case  of  natural  disasters  causing  damage  to 
or  loss  of  a  base  station,  ad-hoc  communications  between 
mobile  devices  may  become  crucial  for  survivors.  In  this 
paper,  we  investigate  the  information  dissemination  process 
in  infrastructure-based  MANETs. 

The  topology  of  a  MANET  often  resembles  the  topology  of 
a  human  network,  in  the  sense  that  the  mobility  of  nodes  in 
a  MANET  is  not  only  similar  to,  but  often  governed  by,  the 
movements  of  their  human  owners.  In  view  of  this,  epidemic 
routing  algorithms  [2]  have  been  proposed  as  a  fast  and 
reliable  approach  to  disseminate  information  in  MANETs.  An 
epidemic routing algorithm adopts the so-called store-carry-forward
paradigm, where a mobile node stores and carries
its received information and then forwards the information
to its neighbors when direct links to them emerge. As a
consequence of the store-carry-forward paradigm, the information
is forwarded from a source to a destination using a
journey instead of a path, where a journey is an alternation of
packet transmissions and carriages that connects a source to a
destination [3]. The epidemic routing algorithm leverages the
mobility of nodes and efficiently disseminates information in
MANETs [2], [4].

In this paper, we study the information dissemination process
in a 2D infrastructure-based MANET, where a piece
of information is broadcast from the infrastructure nodes to
the mobile nodes in a multi-hop manner using a
Susceptible-Infectious-Recovered (SIR) epidemic routing algorithm, which
is introduced in detail in Section III. An asymptotic upper
bound is derived for the percolation probability, viz. the probability
that a non-vanishingly-small fraction of MNs receive
the information as the number of MNs approaches infinity.
Then  we  study  the  expected  fraction  of  MNs  that  receive  the 
information  when  the  information  dissemination  process  stops. 
Further,  we  derive  a  lower  bound  on  the  time  delay  of  the 
information  dissemination  process.  Finally,  the  accuracy  of  the 
analytical  results  is  verified  using  simulations. 

The  rest  of  this  paper  is  organized  as  follows:  Section  II 
reviews  the  related  work.  Section  III  introduces  the  network 
model  and  the  epidemic  routing  algorithm  considered  in  this 
paper.  The  analysis  on  the  information  dissemination  process 
is  given  in  Section  IV.  Section  V  validates  the  analysis 
using  simulations.  Finally  Section  VI  concludes  this  paper  and 
proposes  possible  future  work. 


II.  Related  work 

Some related work treating infrastructure-based ad-hoc networks
can be found in the literature in the context of static
networks. A static network is a special case of a MANET
when the MNs are stationary, in which case the MNs are
referred to as the ordinary nodes (ONs). Alternatively, a static
network can be considered to be a snapshot of a MANET. In
[5], Bermudez and Wicker considered a network where ONs
are uniformly distributed in a unit square, with one IN at each
corner of the square. They empirically investigated the fraction
of ONs that are connected to at least one IN by Monte Carlo
simulations. In [6], Zhang et al. considered a network with
ONs distributed according to a Poisson point process and INs
deterministically placed in a given 2D area. They studied the
fraction of ONs which are connected to at least one IN in at
most $k$ hops. In [7], Dousse et al. considered a network where
ONs are distributed according to a Poisson point process
in $\mathbb{R}^2$ and INs are placed on a grid. Using simulations,
they showed that an increase in the number of INs provides
little improvement on the probability that an arbitrary ON is
connected to at least one IN, when the density of ONs is
either very high or very low. However, the benefit of INs for
the connectivity of networks with an intermediate density of
ONs was not investigated. Note that all the above results are
for static networks.

In  view  of  the  special  characteristics  of  dynamic  networks, 
epidemic  routing  algorithms  [2],  [4]  have  been  proposed  to 
disseminate  information  in  MANETs.  In  [8],  Zyba  et  al. 
studied the performance of a Susceptible-Infectious (SI) epidemic
routing algorithm using real world mobility traces. By
separating  users  into  two  behavioral  classes,  they  obtained  the 
successive  meeting  time  of  two  nodes  and  the  fraction  of  nodes 
that  receive  the  information  broadcast  from  an  arbitrary  source. 
In  [9],  Zhang  et  al.  studied  analytically  the  performance  of 
SIR  epidemic  routing  in  a  MANET  where  nodes  are  Poissonly 
distributed  on  a  torus  and  move  following  a  random  direction 
model.  They  studied  the  fraction  of  MNs  that  receive  the 
information  broadcast  from  an  arbitrary  MN.  However,  the 
aforementioned  studies  considered  only  ad-hoc  connection 
between  MNs  in  a  MANET  without  infrastructure  support. 

Time delay is an important metric for both end-to-end
and broadcast information dissemination. In [10], Groenevelt
et al. studied the end-to-end information dissemination in a
MANET using an unrestricted multi-copy protocol like the SI
epidemic routing algorithm. Using the approximation that the
successive meeting time of two nodes follows an exponential
distribution, they showed that the end-to-end delay between an
arbitrary pair of nodes converges to 0 asymptotically
as the number of nodes $N \to \infty$. In [11], Zhang et al. studied
the performance of several epidemic routing algorithms in
MANETs assuming that the successive meeting time of two
nodes follows an exponential distribution with rate $\beta$. By
solving ordinary differential equations, they showed a similar
result to that in [10], i.e. they characterized the expected
end-to-end delay between an arbitrary pair of nodes using the SI
epidemic routing algorithm in terms of $N$, where $N$ is the number of nodes
in the network. In [12], Zhou et al. studied the performance
of SI epidemic routing in a MANET with $N$ mobile nodes
and $M$ base stations. At each time slot, a single pair of
transmitter-receiver nodes is uniformly and randomly selected
among the $N + M$ nodes in the network. It follows that the rate of
transmission is once per time slot, irrespective of the number
of MNs in the network. Modeling the information propagation
process by a Markov chain, they found that the expected time
required to deliver a message from an arbitrary MN to at least
one of the base stations is $\Theta(N \log N)$. The analysis in this
paper does not rely on assumptions about the inter-meeting
time or the rate of transmission. Further, we show the impact
of various parameters, i.e. node density, radio range and a
quality-of-service metric (fraction of nodes that receive the
information), on the delay of the information dissemination in
an infrastructure-based MANET.

III.  System  model 

A.  Network  model 

Consider a MANET where mobile nodes (MNs) are randomly
and independently distributed on a torus $(0, L]^2$ [13]
following a homogeneous Poisson point process with intensity
$\lambda$. It follows that the expected number of MNs in the
network is $N = \lambda L^2$. Further, $M$ infrastructure nodes (INs)
are deterministically placed in the same area, in a way that they
are not clustered together, as shown in Fig. 1. An example is
to place the INs in a way that the associated Voronoi cells
of these INs have equal size. We consider that the number
of INs is much less than the number of MNs, i.e. $M \ll N$,
which reflects the fact that INs are usually more expensive.
Two nodes are directly connected iff (if and only if) their
Euclidean distance is smaller than or equal to the radio range
$r_0$, viz. we adopt the unit disk communication model. The
use of a torus allows a node located near the boundary to have
the same number of connections probabilistically as a node
located near the center. Therefore, the torus topology is a
simplification that helps to obtain analytical results, because
there is no need to consider the boundary effect, and it is
commonly used in this area [10], [13].

Fig. 1. Examples of the deployment of one, four or nine infrastructure nodes.


Further, MNs move according to the random walk model
(RWM) [14], following which each MN moves at a constant
speed $V$ in a particular direction for a time duration following
an exponential distribution with mean $\zeta$ seconds. The direction
is uniformly chosen in $[0, 2\pi)$, independent of the direction of
other MNs and the node's direction in other time intervals.

Note that when $\zeta \to \infty$, i.e. the direction does not change
over time, our analysis provides the results for a
MANET with the random direction model (RDM) [14]; when
$\zeta \to 0$, the mobility model is like Brownian motion; when
$V \to 0$, our analysis provides the results for a static network.

It is worth noting that under the aforementioned node
distribution model and mobility models, the spatial distribution
of the MNs is stationary and always follows a homogeneous
Poisson point process with intensity $\lambda$ at any time instant [15].
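
The network and mobility model of this subsection can be sketched in a few lines of Python. This is an illustrative sketch rather than the simulator used for the results: all function names are hypothetical, the standard-library random generator is assumed, and positions are kept in $[0, L)$ instead of $(0, L]$, which differs only on a set of measure zero.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean)."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_mobile_nodes(lam, L, rng):
    """Homogeneous Poisson point process with intensity lam on an L x L torus."""
    n = sample_poisson(lam * L * L, rng)
    return [(rng.uniform(0.0, L), rng.uniform(0.0, L)) for _ in range(n)]


def torus_distance(a, b, L):
    """Euclidean distance on the torus: each axis wraps around at L."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return math.hypot(min(dx, L - dx), min(dy, L - dy))


def connected(a, b, r0, L):
    """Unit disk model: directly connected iff the distance is at most r0."""
    return torus_distance(a, b, L) <= r0


def random_walk_move(state, V, mean_leg, duration, L, rng):
    """Advance one MN by `duration` seconds under the random walk model.

    state = (x, y, angle, leg_left), where leg_left is the time remaining
    on the current straight leg. Leg durations are exponential with mean
    mean_leg and each new direction is uniform in [0, 2*pi).
    """
    x, y, angle, leg_left = state
    remaining = duration
    while remaining > 0.0:
        if leg_left <= 0.0:
            angle = rng.uniform(0.0, 2.0 * math.pi)
            leg_left = rng.expovariate(1.0 / mean_leg)
        step = min(leg_left, remaining)
        x = (x + V * step * math.cos(angle)) % L
        y = (y + V * step * math.sin(angle)) % L
        leg_left -= step
        remaining -= step
    return (x, y, angle, leg_left)
```

Carrying `angle` and `leg_left` in the state keeps the direction unchanged across successive calls, so calling the function in small time steps gives the same trajectory law as one call over the whole interval.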

B.  Susceptible-Infectious-Recovered  (SIR)  routing  algorithm 

Consider  a  basic  stochastic  SIR  epidemic  routing  algorithm 
[9],  [16],  where  a  piece  of  information  is  broadcast  from  the 
INs  to  the  MNs  at  time  t  =  0.  We  consider  that  the  INs  are 
connected  to  each  other  by  a  backbone  network  so  that  INs 
can  broadcast  the  same  piece  of  information  at  the  same  time. 

By analogy to the way a disease spreads in a human
network, a MN can be in any of three states S, I, R. A MN that
has never received the information from any of the INs is in the
susceptible state (S), in which the MN can accept incoming
transmissions if such an opportunity arises. A susceptible node
transits into the infected and infectious state (I) immediately
after it has received a copy of the information. A node in state
I keeps transmitting the information, i.e. remains infectious,
for a certain time period $\tau$, which is referred to as the active
period. Note that $\tau$ is a pre-determined value which is the
same for all nodes. After the active period, the node enters
the recovered and immune state (R). A node in state
R stops transmitting the information to other nodes and
ignores all future transmissions of the same information from
other nodes. An animation of the information dissemination
process in a MANET is available at [17].

Note that the INs only transmit a piece of information for a
time period of length $\tau$, after which the information propagates
in the network using the epidemic routing algorithm. The
information dissemination naturally stops (i.e. reaches the
steady state) when there is no infectious node in the network.
We study the fraction of informed MNs when the information
dissemination process has reached the steady state, where the
informed MNs are MNs that have received the information.
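
The S, I, R state machine described above can be sketched as a discrete-time update. This is a simplified illustration under stated assumptions: the function names are hypothetical, node positions are supplied by the caller at each update instant (any mobility model can be plugged in), and contacts that occur strictly between two update instants are missed, so a small time step is assumed.

```python
import math

S, I, R = "S", "I", "R"


def torus_distance(a, b, L):
    """Euclidean distance on an L x L torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return math.hypot(min(dx, L - dx), min(dy, L - dy))


def sir_update(states, infected_at, positions, sources, now, r0, tau, L):
    """Advance the SIR state machine of all MNs to time `now` (in place).

    states[i] is S, I or R; infected_at[i] is the time MN i became infectious.
    sources holds the IN positions; INs transmit only while now < tau.
    """
    # 1) MNs whose active period has elapsed recover and become immune.
    for i, st in enumerate(states):
        if st == I and now - infected_at[i] >= tau:
            states[i] = R
    # 2) Collect the positions of everything transmitting at this instant.
    transmitters = [positions[i] for i, st in enumerate(states) if st == I]
    if now < tau:
        transmitters.extend(sources)
    # 3) Susceptible MNs within r0 of a transmitter become infectious.
    #    MNs infected here start transmitting from the next update on.
    for i, st in enumerate(states):
        if st == S and any(torus_distance(positions[i], p, L) <= r0
                           for p in transmitters):
            states[i] = I
            infected_at[i] = now
    return states


def finished(states, now, tau):
    """Steady state: the INs have stopped and no MN is infectious."""
    return now >= tau and I not in states
```

The fraction of informed MNs in the steady state is then the share of entries of `states` equal to `R` once `finished` returns true.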

The network introduced in this subsection is denoted by
$\mathcal{G}(L, \lambda, M, V, \zeta, \tau)$ hereafter. Further, we consider that the
network is deployed on a sufficiently large torus (i.e. $L > V\tau$)
such that within the active period $\tau$, a node will not be wrapped
back to the point where it starts moving at time 0.

IV. Analysis of the information dissemination process

A.  The  definitive  metric 

The  information  dissemination  process  of  a  MANET  is 
determined  by  a  number  of  parameters  such  as  node  density, 
mobility,  radio  range  and  the  length  of  active  period.  In  this 
subsection,  a  single  metric  is  proposed,  which  captures  the 
impact  of  the  above  parameters. 

Definition 1. The effective node degree $R_0$ of an infectious
node is the expected number of MNs that have been inside
an infectious node's radio range during the infectious node's
active period.

Note that $R_0$ is the same for all MNs because of the
stationarity and homogeneity of node distribution on the torus.
Further, we do not need to consider the INs in the calculation
of $R_0$, because the INs are the sources of the information and
they do not act as relays that receive a piece of information and
then rebroadcast it. Next we provide an upper bound on $R_0$.


Lemma 1. Consider a network $\mathcal{G}(L, \lambda, M, V, \zeta, \tau)$. The effective
node degree $R_0$ satisfies:

$$R_0 \le \frac{8 r_0 V \tau \lambda}{\pi} + \pi r_0^2 \lambda \qquad (1)$$


Proof: We first study the size of the area covered by the
radio range of a typical mobile node ($Q$) during its active
period assuming that its direction does not change. As shown
in Fig. 2, the area consists of two parts. The first part is the
area swept by a segment of length $2r_0$ perpendicular to the
trajectory of the node, which is shown by the line-shaded area
and is referred to as the area swept by the radio range. The
second part is the area of two half-circles, which is shown by
the uniformly shaded area and whose size is $\pi r_0^2$.


Fig.  2.  An  illustration  of  the  area  (line-shaded  and  uniformly  shaded)  covered 
by  the  radio  range  of  a  node  when  the  node  moves  along  a  straight  line. 

Note that if a MN changes its direction over time, then
this will cause overlapping of the trajectory and a reduction
in the size of the area swept by the radio range during its
active period. However, an analytical upper bound on this
area, call it $A(V)$, can be obtained by ignoring the reduction
of the area caused by the overlapping of the trajectory. It is
straightforward that $A(V) \le 2 r_0 V \tau$.

Denote by $\theta$ the angle measured counterclockwise from the
direction of the node $Q$ to the direction of an arbitrary MN.
Recall that the direction of an arbitrary MN is randomly and
uniformly chosen in $[0, 2\pi)$, independent of the directions of
other nodes. It can be shown that $\theta$ is uniformly distributed
in $[0, 2\pi)$. Based on the thinning property of a Poisson
process, it can be shown that the subset of MNs moving
toward directions within $(\theta, \theta + d\theta)$ follows a homogeneous
Poisson process with intensity $\frac{\lambda}{2\pi}\,d\theta$. Further, the relative speed
between node $Q$ and the aforementioned subset of MNs is
$\sqrt{V^2 + V^2 - 2V^2 \cos\theta} = 2V\left|\sin\frac{\theta}{2}\right|$. To facilitate the analysis,
we consider a new coordinate system where the origin
is set at an arbitrary node moving toward direction $\theta$. In this
new coordinate system, the locations of the aforementioned
subset of MNs are approximately stationary$^1$ and the node
$Q$ is moving at speed $2V\left|\sin\frac{\theta}{2}\right|$. It follows that in the new
coordinate system, the expected size of the area swept by the
radio range of node $Q$ is $A\left(2V\left|\sin\frac{\theta}{2}\right|\right)$. Therefore the expected
number of MNs moving toward directions within $(\theta, \theta + d\theta)$
that have been inside the area swept by the radio range of
node $Q$ during its active period is $A\left(2V\left|\sin\frac{\theta}{2}\right|\right)\frac{\lambda}{2\pi}\,d\theta$.

$^1$ The locations of the MNs may have small displacements during the time
interval $\tau$ due to a small direction difference $d\theta$. However, the displacements
become vanishingly small when $d\theta \to 0$ while $V$ and $\tau$ are finite.

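Considering all mobile nodes in the network, it is straightforward
that the expected number of MNs that have been
inside the area swept by the radio range of node $Q$ is:

$$\int_0^{2\pi} A\left(2V\left|\sin\tfrac{\theta}{2}\right|\right) \frac{\lambda}{2\pi}\,d\theta \le \int_0^{2\pi} 4 r_0 V \tau \left|\sin\tfrac{\theta}{2}\right| \frac{\lambda}{2\pi}\,d\theta = \frac{8 r_0 V \tau \lambda}{\pi}$$

Therefore, the expected number of nodes that have been
inside the radio range of node $Q$ during its active period is

$$R_0 \le \frac{8 r_0 V \tau \lambda}{\pi} + \pi r_0^2 \lambda$$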
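
The integral over directions in the proof can be checked numerically. The sketch below rests on two assumptions read from the proof: directions are thinned with density $1/(2\pi)$, and the bound on $R_0$ is the swept-area term plus the disk term $\pi r_0^2 \lambda$ from the two half-circles. Function names are hypothetical.

```python
import math


def mean_relative_speed(V, steps=10000):
    """Average of 2*V*|sin(theta/2)| for theta uniform in [0, 2*pi).

    Uses the midpoint rule; the exact value is 4*V/pi.
    """
    h = 2.0 * math.pi / steps
    total = sum(2.0 * V * abs(math.sin((k + 0.5) * h / 2.0))
                for k in range(steps))
    return total * h / (2.0 * math.pi)


def effective_degree_bound(lam, r0, V, tau):
    """Upper bound on R_0: swept-area term plus the initial disk term."""
    swept = 8.0 * r0 * V * tau * lam / math.pi
    disk = math.pi * r0 ** 2 * lam
    return swept + disk
```

The swept-area term equals $2 r_0 \tau \lambda$ times the mean relative speed $4V/\pi$, which is where the factor $8/\pi$ comes from.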


B.  Percolation  probability 

In this subsection, we study the percolation probability of
an infrastructure-based MANET in the limit of large network
size, which means that we
increase the network area to infinity (i.e. let $L \to \infty$) while
keeping other parameters, such as $\lambda$, $V$ and $r_0$, constant. Note
that the distance between INs also increases with $L$ such that
the ratio between the two values is kept constant as shown in
Fig. 1. The percolation probability is defined in the following:

Definition 2. The percolation probability $p_c$ of an
infrastructure-based MANET is the probability that a piece
of information broadcast from the INs is received by a
non-vanishingly-small fraction of MNs in the steady state, in the
limit of large network size.

We  have  the  following  result  for  the  percolation  probability. 

Theorem 1. Consider a network $\mathcal{G}(L, \lambda, M, V, \zeta, \tau)$, whose
effective node degree is known to be $R_0$. The percolation
probability satisfies $p_c \le 1 - \left(\frac{W(-R_0 e^{-R_0})}{-R_0}\right)^M$, where $W(\cdot)$ is the
Lambert W Function.
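
The bound in Theorem 1 can be evaluated without a Lambert W routine. The sketch below assumes a Poisson offspring distribution with mean $R_0$, for which the extinction probability is the smallest root in $[0, 1]$ of $q = e^{R_0 (q - 1)}$ and coincides with $W(-R_0 e^{-R_0})/(-R_0)$ on the principal branch. The function names are hypothetical.

```python
import math


def extinction_probability(R0, tol=1e-12, max_iter=100000):
    """Smallest root in [0, 1] of q = exp(R0 * (q - 1)).

    Assumes Poisson offspring with mean R0. Fixed-point iteration from
    q = 0 increases monotonically to the smallest root.
    """
    if R0 <= 1.0:
        return 1.0  # a (sub)critical branching process dies out almost surely
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(R0 * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def percolation_upper_bound(R0, M):
    """Theorem 1 bound: p_c <= 1 - q^M for M infrastructure nodes."""
    return 1.0 - extinction_probability(R0) ** M
```

For $R_0 \le 1$ the bound is zero for any $M$, which matches the intuition that the information cannot reach a non-vanishing fraction of MNs when each relay informs at most one new node on average.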


Proof: We first consider the information dissemination
process of a single IN, in which a node (say A) is a child of
another node (say B) iff node A (which must be a susceptible
node) receives the information from node B (which must be
an infectious node). Denote by $\chi_N(k)$ the number of nodes
at the $k$th generation in the information dissemination process
of a single IN. Then $\chi_N(k)$ can be modeled by a branching
process whose expected number of children per node is smaller
than $R_0$, because of the reduction of the fraction of susceptible
nodes in the network as the information propagates. Next we
introduce a Galton-Watson branching process [18], in which
the expected number of children per node is $R_0$. Denote by
$\chi_1(k)$ the number of individuals at the $k$th generation of the
Galton-Watson branching process. As discussed earlier, we
have $\chi_N(k) \le \chi_1(k)$ in stochastic ordering$^2$ for all $k \ge 1$.
Note that at the first generation, though there is no previous
reduction of the fraction of susceptible nodes, the relation
$\chi_N(1) \le \chi_1(1)$ is still valid because an IN does not move
over time so that the expected number of children of an IN
can be smaller than $R_0$.

Define $q = \Pr(\lim_{k \to \infty} X_1(k) = 0)$ to be the extinction
probability of the Galton-Watson branching process $X_1(k)$.
It can be shown (cf. [18, Theorem 6.5.1]) that
$q = \frac{W(-R_0 e^{-R_0})}{-R_0}$, where $W(\cdot)$ is the Lambert W function [19].
Further, define $q_N = \Pr(\lim_{k \to \infty} X_N(k) = 0)$ to be the
extinction probability of the branching process $X_N(k)$. Due
to the stochastic ordering $X_N(k) \le X_1(k)$, we have

$$q_N \ge q = \frac{W(-R_0 e^{-R_0})}{-R_0} . \qquad (5)$$



Recall that we have $M$ INs in the network. Denote by
$p'_c$ the probability that at least one of the $M$ information
dissemination processes, rooted at the $M$ INs respectively, does
not become extinct. It can be shown that

$$p'_c \le 1 - q_N^M \le 1 - q^M , \qquad (6)$$



where the second inequality is due to Eq. 5 and the first
inequality is due to a spatial correlation between INs. The spatial
correlation between INs arises if the Euclidean distance between
two INs is small, for then two information dissemination
processes may share a common set of susceptible nodes. As
the information propagates, the reduction of the fraction of
susceptible nodes caused by the information dissemination
process of one IN can reduce the expected number of children
per node in the information dissemination process of another
IN. Therefore we have $p'_c \le 1 - q_N^M$.

Further, the event of having an unbounded number of
informed MNs in the limit of large network size (whose
probability is $p'_c$) is a necessary condition for having a
non-vanishingly-small fraction of informed MNs in the limit of large
network size. Therefore we have $p_c \le p'_c$. ■
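
The bound in Theorem 1 can be evaluated numerically without a Lambert W routine. The following is a minimal sketch, assuming the Galton-Watson offspring distribution is Poisson with mean $R_0$, in which case the extinction probability is the smallest root of $q = e^{R_0 (q - 1)}$ in $[0, 1]$ and coincides with $W(-R_0 e^{-R_0})/(-R_0)$. The function names and the fixed-point iteration are illustrative choices, not part of the original analysis.

```python
import math


def extinction_probability(r0, tol=1e-12, max_iter=100_000):
    """Smallest root in [0, 1] of q = exp(r0 * (q - 1)).

    Assumption: Poisson offspring with mean r0. Iterating from q = 0
    increases monotonically towards the smallest fixed point.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(r0 * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def percolation_upper_bound(r0, num_ins):
    """Upper bound 1 - q^M on the percolation probability (Theorem 1)."""
    return 1.0 - extinction_probability(r0) ** num_ins


if __name__ == "__main__":
    for r0 in (0.5, 1.5, 2.0, 5.0):
        bounds = [percolation_upper_bound(r0, m) for m in (1, 4, 16)]
        print(r0, [round(b, 4) for b in bounds])
```

For $R_0 \le 1$ the iteration converges to $q = 1$, so the bound is 0 regardless of $M$; for $R_0 > 1$ the bound grows with both $R_0$ and $M$.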

Remark 1. To improve the percolation probability of a given
network $G(L, \lambda, M, V, r_0, \tau)$, one can reduce the spatial
correlation between INs by reducing the number of susceptible
nodes shared by the INs. Therefore we consider placing the
INs in a way that the associated Voronoi cells have equal size,
as introduced in Section III.

Remark  2.  The  above  results  suggest  that  an  increase  in  the 
number  of  INs  provides  little  improvement  on  the  percolation 
probability  when  the  density  of  MNs  is  either  very  low  (such 
that  q  =  1)  or  very  high  (such  that  q  =  0).  This  result  coincides 
with  the  result  for  a  static  network  [7].  However,  our  results 
further  suggest  that  when  the  density  of  MNs  is  intermediate, 
an  increase  in  the  number  of  INs  can  improve  the  percolation 
probability  significantly,  which  is  further  verified  in  Section  V. 

In addition, it can be shown that $1 - \left( \frac{W(-R_0 e^{-R_0})}{-R_0} \right)^M$ is
0 for $0 < R_0 \le 1$ and is monotonically increasing with
respect to $R_0$ for $R_0 > 1$. Therefore the following corollary
can be readily obtained using Lemma 1 and Theorem 1.

$^2$ Using stochastic ordering, we say $X_N(k) \le X_1(k)$ iff
$\Pr(X_N(k) > a) \le \Pr(X_1(k) > a)$ for any number $a$.


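Corollary 2. Consider a network $G(L, \lambda, M, V, r_0, \tau)$. The
percolation probability satisfies

$$p_c \le 1 - \left( \frac{W\!\left( -\left( \frac{8 r_0 V \lambda \tau}{\pi} + \pi r_0^2 \lambda \right) e^{-\left( \frac{8 r_0 V \lambda \tau}{\pi} + \pi r_0^2 \lambda \right)} \right)}{-\frac{8 r_0 V \lambda \tau}{\pi} - \pi r_0^2 \lambda} \right)^M ,$$

where $W(\cdot)$ is the Lambert W function.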


The  percolation  probability  provides  the  probability  that  a 
piece  of  information  can  be  spread  out  to  a  significant  fraction 
of  MNs.  In  the  next  subsection,  we  quantify  how  many  MNs 
can  receive  the  information. 


C.  Expected  fraction  of  informed  MNs 

Recall  that  a  piece  of  information  broadcast  from  INs  is 
said  to  percolate  if  the  information  is  received  by  a  non- 
vanishingly-small  fraction  of  MNs  in  the  steady  state. 

Define $z_p$ to be the expected fraction of informed MNs in
the steady state given that the information percolates. Because
$M \ll N$, the following result can be readily obtained based
on Lemma 3 in [9].

Theorem 2. Consider a network $G(L, \lambda, M, V, r_0, \tau)$ whose
effective node degree is known to be $R_0$. If the information
percolates, then the expected fraction of informed MNs in the
steady state satisfies $z_p \le 1 + \frac{1}{R_0} W(-R_0 e^{-R_0})$, where $W(\cdot)$
is the Lambert W function.
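
As an illustration of Theorem 2, the quantity $1 + \frac{1}{R_0} W(-R_0 e^{-R_0})$ is the largest root of the final-size equation $z = 1 - e^{-R_0 z}$, which follows by substituting $w = R_0 (z - 1)$ into $w e^w = -R_0 e^{-R_0}$. The sketch below evaluates it by fixed-point iteration; the function name and the iteration scheme are assumptions made for illustration only.

```python
import math


def informed_fraction_bound(r0, tol=1e-12, max_iter=100_000):
    """Largest root in [0, 1] of z = 1 - exp(-r0 * z).

    Iterating from z = 1 decreases monotonically towards the largest
    fixed point, which equals 1 + W(-r0 * exp(-r0)) / r0 on the
    principal branch of the Lambert W function.
    """
    z = 1.0
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


if __name__ == "__main__":
    for r0 in (0.5, 1.5, 2.0, 5.0):
        print(r0, round(informed_fraction_bound(r0), 4))
```

The bound depends on $R_0$ only, not on the number of INs $M$.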

It  is  further  shown  in  Section  V  that  increasing  the  number 
of  INs  has  little  impact  on  zp.  On  the  other  hand,  increasing 
the  number  of  INs  can  reduce  the  delay  of  the  information 
dissemination  process,  which  is  studied  in  the  next  subsection. 

D.  Information  dissemination  delay 

Consider that a piece of information is broadcast from the
INs at time $t = 0$ using the SIR epidemic routing algorithm.
Let $T(z)$ be the expected time when the fraction of informed
MNs reaches $z$, for $0 < z < 1$. In this subsection, we provide
a lower bound on the delay $T(z)$ by considering an SI in lieu
of the SIR epidemic routing algorithm.

It  is  worth  noting  that  in  a  multi-hop  network,  a  piece 
of  information  can  be  forwarded  several  hops  in  a  short 
time  interval  if  a  journey  exists  in  such  time  interval.  The 
multi-hop  forwarding  is  usually  much  faster  than  the  store- 
carry-forward  method  because  the  latter  method  relies  on 
the physical movement of MNs. Therefore the time delay
of the information dissemination process allowing multi-hop
forwarding is different from, and more realistic than, that in
previous research on epidemics or MANETs based on a particular
inter-meeting time or rate of transmission [11], [12], [16], as
discussed in Section II. We take the multi-hop forwarding into
consideration and provide the following result.

Theorem 3. Consider a network $G(L, \lambda, M, V, r_0, \tau)$. A piece
of information is broadcast from the INs at time $t = 0$ using
the SIR epidemic routing algorithm. Let $T(z)$ be the expected
time when the fraction of informed MNs reaches $z$, for
$0 < z < 1$. Then $T(z) \ge \frac{\tau}{R_0 + N z_s} \ln(\cdots)$, where
$z_s = \left( 1 + \frac{1}{\lambda \pi r_0^2} W(-\lambda \pi r_0^2 e^{-\lambda \pi r_0^2}) \right)^2$ and $N = \lambda L^2$.

Proof: Firstly we consider the multi-hop forwarding.
Let $z_m$ be the expected fraction of MNs that receive the
information broadcast from an arbitrary node at an arbitrary
time instant. By letting $V = 0$, Lemma 1 gives $R_0 \le \lambda \pi r_0^2$.
Then using Corollary 2 and Theorem 2, it can be shown that

$$z_m \le \left( 1 + \frac{1}{\lambda \pi r_0^2} W(-\lambda \pi r_0^2 e^{-\lambda \pi r_0^2}) \right)^2 \triangleq z_s .$$

It is straightforward that the recovery mechanism reduces
the number of infectious nodes and hence slows down the
information dissemination process. Therefore we can obtain a
lower bound on $T(z)$ by ignoring the recovery mechanism,
viz. considering an SI epidemic spreading where the expected
number of nodes that come into the radio range of an arbitrary
infectious node in a unit time interval is $a = \frac{R_0 + N z_s}{\tau}$. Let
$I(t)$ be the number of infectious nodes at time $t$; then it can
be shown using the same method as that in Lemma 3 of [9]:

$$I'(t) = a \left( 1 - \frac{I(t)}{N} \right) I(t)$$

Solving the above ODE, we have:
Letting z = the above equation can be re-written as:
As discussed earlier, we have $T(z) \ge t(z)$. ■

Note  that  Theorem  3  can  serve  as  a  lower  bound  for 
other  information  dissemination  methods  for  infrastructure- 
based  MANETs,  because  the  SI  epidemic  routing  algorithm, 
like  flooding,  has  been  shown  to  have  the  lowest  delay  for 
information  dissemination  in  MANETs  [4],  [10],  [11]. 
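
The SI dynamics used in the proof of Theorem 3 follow the logistic equation $I'(t) = a (1 - I(t)/N) I(t)$, whose closed-form solution is standard. The following is a minimal sketch of that solution and of its inverse $t(z)$. The initial number of infectious nodes `i0` is a hypothetical parameter introduced here for illustration; the initial condition used in the original derivation is not recoverable from the text.

```python
import math


def si_infected(t, a, n, i0):
    """Closed-form solution of I'(t) = a * (1 - I/n) * I with I(0) = i0."""
    growth = math.exp(a * t)
    return n * i0 * growth / (n + i0 * (growth - 1.0))


def si_time_to_fraction(z, a, n, i0):
    """Time at which I(t) / n reaches z, for i0 / n < z < 1.

    Obtained by inverting si_infected for exp(a * t).
    """
    return math.log(z * (n - i0) / (i0 * (1.0 - z))) / a


if __name__ == "__main__":
    a, n, i0 = 0.1, 1280.0, 4.0  # illustrative values only
    t_half = si_time_to_fraction(0.5, a, n, i0)
    print(t_half, si_infected(t_half, a, n, i0) / n)
```

Since $t(z)$ scales as $1/a$, a larger contact rate $a$ shortens the delay, and a larger `i0` shortens it only logarithmically.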


V.  Simulation  results 

In this section, we report on simulations to verify the accuracy
of the analytical results. The simulations are conducted
using a MANET simulator written in C++. MNs are deployed
on a torus $(0, 800]^2$ following a homogeneous Poisson process
with intensity $\lambda = 0.002$ users/m$^2$, which is the population
density in Sydney city [20]. Speed $V$ is 1.5 m/s (typical
human walking speed [21]) or 10 m/s (typical vehicle moving
speed [22]). The radio range $r_0$ is varied from 1 to 20. (The
radio range of a Bluetooth device is around 10 m [23].) The
active period $\tau$ is set to be 100 or 200; results using
other values of $\tau$ show a similar trend. The INs are placed
deterministically as shown in Fig. 1, and it remains our future
work to find the optimal placement of INs for information
dissemination in MANETs. Every point shown in the figures is
the average value from 500 simulations; the confidence
interval is too small to be distinguishable and so is omitted.
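
The node deployment described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the C++ simulator used in the paper: the function names are hypothetical, and the Poisson count is drawn by counting unit-rate exponential arrivals, which is one of several valid ways to sample it.

```python
import math
import random


def poisson_count(mean, rng):
    """Sample a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def deploy_nodes(side, intensity, rng):
    """Homogeneous Poisson point process on a square of the given side length."""
    n = poisson_count(intensity * side * side, rng)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]


def torus_distance(p, q, side):
    """Euclidean distance with wrap-around in both coordinates."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, side - dx), min(dy, side - dy))


# Parameters taken from the text: side 800 m, intensity 0.002 users/m^2.
example_nodes = deploy_nodes(800.0, 0.002, random.Random(1))

if __name__ == "__main__":
    print(len(example_nodes))  # expected value is 0.002 * 800^2 = 1280
```

The wrap-around distance removes boundary effects, so every node sees the same expected number of neighbours within radio range.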

In  Fig.  3,  the  simulation  result  for  percolation  probability 
shows  the  probability  that  at  least  5%  of  MNs  receive  the 
information  broadcast  from  INs.  Firstly,  it  can  be  seen  in  Fig. 
3  that  the  analytical  result  well  captures  the  impact  of  various 
parameters,  e.g.  nodal  speed,  radio  range  and  the  number  of 
INs  on  the  information  dissemination  process.  Further,  it  can 
be seen in Fig. 3 (a) that a significantly higher percolation
probability can be achieved by a small number of additional
INs (e.g. from 1 IN to 4 INs). On the other hand, it can be
seen  in  Fig.  3  (b)  that  the  introduction  of  more  INs  cannot 
improve  the  expected  fraction  of  informed  nodes  in  the  steady 
state  if  the  information  percolates.  This  observation  confirms 
our  assertion  given  in  Section  IV-C.  Further  study  reveals 
that  if  a  MN  (or  a  group  of  MNs)  does  not  receive  the 
information  given  that  the  information  percolates  elsewhere 
in  the  network,  the  MN  (or  the  group  of  MNs)  is  usually 
geometrically  separated  from  the  informed  nodes.  However, 
a  small  number  of  additional  INs  has  a  small  probability  of 
bridging  the  geometric  gap. 


Fig. 3. Analytical (Ana) and simulation (Sim) results for (a) Percolation
probability; (b) Expected fraction of informed MNs if the information percolates.


Fig.  4  shows  the  time  delay  for  a  piece  of  information  to 
be  received  by  50%  of  MNs.  It  can  be  seen  that  our  analytical 
result  captures  the  impact  on  the  delay  of  network  parameters 
such as radio range, nodal speed or the number of INs.

Fig. 4. Analytical (Ana) and simulation (Sim) results for the time delay for
a piece of information to be received by 50% of MNs. The curves for V=10
overlap in (a), but the accuracy of the results for V=10 can be seen in (b).


VI.  Conclusions  and  future  work 
In this paper, we study the dissemination of a piece of
information broadcast from INs in an infrastructure-based
MANET using an SIR epidemic routing algorithm. Analytical
results are derived for the percolation probability, the expected
fraction of informed MNs and the time delay. The accuracy
of the analytical results is verified using simulations. In the
future, we intend to study an infrastructure-based MANET
driven by a real-world trace.

References 

[1] E. Daly and M. Haahr, “Social network analysis for routing in
disconnected delay-tolerant MANETs,” in Proceedings of MobiHoc, 2007,
pp. 32-40.

[2]  A.  Vahdat  and  D.  Becker,  “Epidemic  routing  for  partially-connected  ad 
hoc  networks,”  Duke  Tech  Report  CS-2000-06,  2000. 

[3]  G.  Mao  and  B.  D.  O.  Anderson,  “Graph  theoretic  models  and  tools  for 
the  analysis  of  dynamic  wireless  multihop  networks,”  in  IEEE  WCNC, 
2009,  pp.  2828-2833. 

[4] R. J. D’Souza and J. Jose, “Routing approaches in delay tolerant
networks: A survey,” International Journal of Computer Applications,
vol. 1, no. 17, pp. 8-14, 2010.

[5] S. A. Bermudez and S. B. Wicker, “Partial connectivity of multi-hop
two-dimensional finite hybrid wireless networks,” in Proc. IEEE
International Conference on Communications, ICC, 2010.

[6] Z. Zhang, S. C. Ng, G. Mao, and B. D. O. Anderson, “On the k-hop
partial connectivity in finite wireless multi-hop networks,” in IEEE ICC,
2011.

[7] O. Dousse, P. Thiran, and M. Hasler, “Connectivity in ad-hoc and hybrid
networks,” in Proceedings IEEE INFOCOM, vol. 2, 2002, pp. 1079-1088.

[8] G. Zyba, G. M. Voelker, S. Ioannidis, and C. Diot, “Dissemination in
opportunistic mobile ad-hoc networks: The power of the crowd,” in
Proceedings IEEE INFOCOM, 2011, pp. 1179-1187.

[9] Z. Zhang, G. Mao, and B. D. O. Anderson, “On the information
propagation in mobile ad-hoc networks using epidemic routing,” accepted by
Globecom 2011, pp. 1-6, 2011.

[10] R. Groenevelt, P. Nain, and G. Koole, “The message delay in mobile
ad hoc networks,” Performance Evaluation, vol. 62, no. 1-4, pp. 210-228,
2005.

[11]  X.  Zhang,  G.  Neglia,  J.  Kurose,  and  D.  Towsley,  “Performance  modeling 
of  epidemic  routing,”  Computer  Networks:  The  International  Journal 
of  Computer  and  Telecommunications  Networking,  vol.  51,  no.  10,  pp. 
2867-2891,  2007. 

[12] S. Zhou, L. Ying, and S. Tirthapura, “Delay, cost and infrastructure
tradeoff of epidemic routing in mobile sensor networks,” in Proceedings
of the 6th International Wireless Communications and Mobile Computing
Conference, 2010, pp. 1242-1246.

[13]  M.  Franceschetti  and  R.  Meester,  Random  networks  for  communication: 
From  Statistical  Physics  to  Information  Systems.  Cambridge  University 
Press,  2007. 

[14] T. Camp, J. Boleng, and V. Davies, “A survey of mobility models
for ad hoc network research,” Wireless Communications and Mobile
Computing, vol. 2, no. 5, pp. 483-502, 2002.

[15] P. Nain, D. Towsley, B. Liu, and Z. Liu, “Properties of random direction
models,” in INFOCOM 2005. 24th Annual Joint Conference of the IEEE
Computer and Communications Societies, vol. 3, 2005, pp. 1897-1907.

[16]  T.  Britton,  “Stochastic  epidemic  models:  A  survey,”  Mathematical 
Biosciences,  vol.  225,  no.  1,  pp.  24-35,  2010. 

[17]  Z.  Zhang,  “Mobile  ad-hoc  networks,”  2011.  [Online].  Available: 
http://zijie.net/manet/ 

[18]  P.  Jagers,  Branching  processes  with  biological  applications.  Wiley, 
1975. 

[19]  R.  M.  Corless,  G.  H.  Gonnet,  D.  E.  G.  Hare,  D.  J.  Jeffrey,  and  D.  E. 
Knuth,  “On  the  Lambert  W  Function,”  Advances  in  Computational 
Mathematics,  vol.  5,  pp.  329-359,  1996. 

[20]  Demographia,  “World  urban  areas:  7th  annual  edition,”  2011.  [Online]. 
Available:  www.demographia.com/db-worldua.pdf 

[21]  N.  Carey,  “Establishing  pedestrian  walking  speeds,”  Karen  Aspelin, 
Portland  State  University,  2005. 

[22] M. Rudack, M. Meincke, and M. Lott, “On the dynamics of ad hoc
networks for inter vehicle communications (IVC),” in Proc. ICWN, 2002.

[23] A.-K. Pietiläinen and C. Diot, “Experimenting with opportunistic
networking,” in Proc. of the ACM MobiArch Workshop, 2009.




On  the  Hop  Count  Statistics  in  Wireless 
Multi-hop Networks Subject to Fading

Abstract: Consider a wireless multi-hop network where nodes are randomly distributed in a given area following a homogeneous
Poisson process. The hop count statistics, viz. the probabilities related to the number of hops between two nodes, are important for
performance  analysis  of  the  multi-hop  networks.  In  this  paper,  we  provide  analytical  results  on  the  probability  that  two  nodes  separated 
by  a  known  Euclidean  distance  are  k  hops  apart  in  networks  subject  to  both  shadowing  and  small-scale  fading.  Some  interesting 
results  are  derived  which  have  generic  significance.  For  example,  it  is  shown  that  the  locations  of  nodes  three  or  more  hops  away 
provide  little  information  in  determining  the  relationship  of  a  node  with  other  nodes  in  the  network.  This  observation  is  useful  for  the 
design  of  distributed  routing,  localization  and  network  security  algorithms.  As  an  illustration  of  the  application  of  our  results,  we  derive 
the  effective  energy  consumption  per  successfully  transmitted  packet  in  end-to-end  packet  transmissions.  We  show  that  there  exists  an 
optimum  transmission  range  which  minimizes  the  effective  energy  consumption.  The  results  provide  useful  guidelines  on  the  design  of 
a  randomly  deployed  network  in  a  more  realistic  radio  environment. 

Index  Terms — hop  count,  log-normal  shadowing,  Nakagami-m  fading,  energy  consumption,  wireless  multi-hop  networks. 



1  Introduction 

A  wireless  multi-hop  network  consists  of  a  group  of 
nodes  that  communicate  with  each  other  over  wireless 
channels.  The  nodes  in  such  a  network  operate  in  a 
decentralized  and  self-organized  manner  and  each  node 
can  act  as  a  relay  to  forward  information  toward  the 
destination.  Wireless  multi-hop  networks  have  large  po¬ 
tential  in  military  and  civilian  applications  [1],  [2]. 

There are three related probabilities characterizing the
connectivity properties of such a multi-hop network.
These are the probability that an arbitrary node is $k$ hops
apart from another arbitrary node, denoted by $\Pr(k)$; the
probability that a node at a Euclidean distance $x$ apart
from another node is connected to that node in exactly $k$
hops, denoted by $\Pr(k|x)$; and the spatial distribution of
the nodes $k$ hops apart from a designated node, denoted
by $\Pr(x|k)$. These three probabilities are related through
Bayes' formula and if one is computable, the other two
will be computable using similar techniques. Therefore
we call these problems collectively the hop count statistics
problems. We refer readers to Section 9 in the supplementary
material for an extensive discussion on the
use of the three probabilities in various applications.
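
The Bayes relation among the three probabilities can be written out explicitly. The form below is a sketch under two notational assumptions that are not in the original text: $f(x)$ denotes the probability density of the Euclidean distance between two arbitrary nodes, and $f(x \mid k)$ denotes the density form of $\Pr(x|k)$.

```latex
f(x \mid k) = \frac{\Pr(k \mid x)\, f(x)}{\Pr(k)},
\qquad
\Pr(k) = \int_{0}^{\infty} \Pr(k \mid x)\, f(x)\, \mathrm{d}x .
```

Under these assumptions, once $\Pr(k|x)$ is known for all $x$, both $\Pr(k)$ and the spatial distribution follow by integration against $f(x)$.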

A major technical obstacle in the analysis of hop
count statistics is the so-called spatial dependence problem
[3], [4]. The spatial dependence problem arises because
in a wireless multi-hop network the event that a randomly
chosen node is $k$ hops apart from a particular
node is not independent of the event that another
randomly chosen node is $i$ hops apart from the same
node for $i < k$. It follows that an accurate analysis of
the conditional probability $\Pr(k|x)$ needs to consider all
previous hops, which makes the analysis complicated.
This technical hurdle caused by the spatial dependence
problem is recognized in the literature but remains
unsolved [4]. In this paper, a significant improvement
in the accuracy of computing $\Pr(k|x)$ is achieved by
considering the positions of the previous two hops' nodes,
compared to the results considering the positions of
the previous hop's nodes only. Further, we show that
considering the positions of the previous two hops' nodes
is enough to provide an accurate estimate of $\Pr(k|x)$. A
detailed explanation of the spatial dependence problem
is given in Section 3.1.

Further, much previous research (e.g. [5]-[8]) establishes
results based on the assumption that the network
is connected, i.e. there is at least one path between every
pair of nodes. However in many wireless multi-hop
network applications, it is not only impractical (due to
the randomness of node deployment, a complex radio
environment, or the high node density required for a
large scale network to be connected [9]) but also unnecessary
to require every node to be connected to every other
node. For example, in applications which are not
life-critical, e.g. habitat monitoring or environmental
monitoring, having a few disconnected source-destination
pairs will not cause statistically significant change in
the monitored parameters [10]. In addition, the high node
density required for a connected network has a
downside for capacity. Gupta and Kumar
[11] showed that in a connected network, as the number
of nodes per unit area $n$ increases, the throughput
per source-destination pair decreases approximately as
$1/\sqrt{n}$. It was further pointed out in [12] that significant
energy  savings  can  be  achieved  by  requiring  most  nodes 
but  not  all  nodes  to  be  connected.  Therefore  we  consider 
the  more  realistic  and  practical  scenario  that  the  network 
is  not  necessarily  connected.  Previous  results  established 
on  the  basis  of  a  connected  network  actually  form  special 
cases  of  the  problem  examined  in  this  paper. 

The  main  contributions  of  this  paper  are: 

•  Firstly, in a network with nodes distributed in a finite
area following a homogeneous Poisson process,
we derive the probability that two nodes separated
by a known Euclidean distance $x$ are $k$ hops apart,
i.e. $\Pr(k|x)$, considering both shadowing and
small-scale fading and using a distributed routing
algorithm, i.e. greedy forwarding;

•  Secondly, we analyze the impact of the spatial
dependence problem on the accuracy of $\Pr(k|x)$;

•  Thirdly, considering a sparse network in which there
is not necessarily a path between any pair of nodes,
we derive the probability distribution of the number
of hops traversed by packets before being dropped
if the transmission is unsuccessful. (An end-to-end
transmission is unsuccessful if the packet sent from
a source to a destination has to be dropped at an
intermediate node because it is unable to find a
next-hop node.)

•  As an application of the results, we derive the
effective energy consumption per successfully
transmitted packet in end-to-end packet transmissions.
We show that there exists an optimum transmission
range which minimizes the effective energy
consumption. Further, the impacts of unreliable links,
node density and path loss exponent on the energy
consumption are included in the analysis.

The  results  in  this  paper  provide  a  more  complete 
understanding  on  the  properties  of  wireless  multi-hop 
networks  in  a  more  realistic  and  practical  setting. 

The  rest  of  this  paper  is  organized  as  follows:  Section 
8,  in  the  supplementary  material,  reviews  the  related 
work.  Section  2  introduces  the  network  models  and  some 
definitions.  The  analysis  of  the  hop  count  statistics  and 
the  end-to-end  energy  consumption  under  the  unit  disk 
communication  model  is  given  in  Section  3,  followed  by 
the  analysis  on  the  impact  of  the  spatial  dependence 
problem  on  the  accuracy  of  the  hop  count  statistics  in 
Section  4.  In  Section  5,  we  further  include  the  log-normal 
shadowing  and  small-scale  fading  in  the  analysis.  The 
simulation  results  and  discussions  are  given  in  Section 
6. Finally Section 7 concludes this paper and proposes
possible future work.


2  System  model 

2.1  Network  model 


In this paper, we consider a wireless multi-hop network
where nodes are identically and independently
distributed (i.i.d.) in a square according to a homogeneous
Poisson point process with a known intensity $\rho$.

We consider that every node has the same transmission
power. The simplest radio propagation model is the
unit disk communication model. Under the unit disk
model, the power attenuates with the Euclidean distance
$x$ from a transmitter like $x^{-\eta}$, where $\eta$ is the path loss
exponent. The path loss exponent can vary from 2 in
free space to 6 in urban areas [18]. The received signal
strength (RSS) at a receiver separated by Euclidean
distance $x$ from the transmitter is $P_u(x) = C P_t x^{-\eta}$, where
$C$ is a constant and $P_t$ is the transmission power. A
transmission is successful iff the RSS exceeds a given
threshold $P_{\min}$. Therefore the required transmission power $P_t$
allowing a transmission range $r_0$ is $P_t = C_1 r_0^{\eta}$, where
$C_1 = P_{\min}/C$.
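
The relation between transmission range and transmission power under the unit disk model can be checked with a few lines of code. This is a minimal sketch; the function names and the numerical values are illustrative assumptions, not taken from the paper.

```python
def received_power(x, p_t, c, eta):
    """RSS under the unit disk model: P_u(x) = C * P_t * x^(-eta)."""
    return c * p_t * x ** (-eta)


def required_tx_power(r0, p_min, c, eta):
    """Transmission power giving range r0: P_t = C_1 * r0^eta, C_1 = P_min / C."""
    return (p_min / c) * r0 ** eta


if __name__ == "__main__":
    c, eta, p_min = 1e-3, 3.0, 1e-9  # illustrative values only
    p_t = required_tx_power(10.0, p_min, c, eta)
    # At exactly x = r0 the RSS equals the threshold P_min.
    print(p_t, received_power(10.0, p_t, c, eta))
```

Doubling the range multiplies the required power by $2^{\eta}$, which is why the path loss exponent matters so much for the energy analysis.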

The unit disk model is simple but unrealistic. In reality
the RSS may have significant variations around the mean
value, because of both large-scale variation (i.e. shadowing)
and small-scale fading. Considering a typical type
of shadowing, i.e. the log-normal shadowing model [18], the
RSS attenuation (in dB) follows a normal distribution
with respect to the distance $x$ between transmitter and
receiver: $10 \log_{10}(P_l(x) / (C P_t x^{-\eta})) \sim Z$, where $P_l(x)$ is the
RSS in the log-normal shadowing model and $Z$ is a
zero-mean Gaussian distributed random variable with standard
deviation $\sigma$. When $\sigma = 0$ the model reduces to the
unit disk model. In practice the value of $\sigma$ is often computed
from measured data and can be as large as 12 [18].
Denote by $q(z)$ the pdf (probability density function)
of the shadowing fades; then $q(z) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{z^2}{2\sigma^2}\right)$.
As widely used in the literature [18], [20], [25], we
assume that the shadowing fades $Z$ between all pairs
of transmitting node and receiving node are i.i.d. and
the link is symmetric. The limitation of the assumption
of the independence between links is discussed in
Section 10 in the supplementary material.
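
Under the log-normal shadowing model, the probability that a link of length $x$ exists follows directly from the definitions above: with $P_{\min} = C P_t r_0^{-\eta}$, the condition $P_l(x) \ge P_{\min}$ is equivalent to $Z \ge 10 \eta \log_{10}(x / r_0)$. The sketch below evaluates this Gaussian tail probability; the function name is an illustrative assumption, and this worked consequence is offered as a sketch rather than as the expression used later in the paper.

```python
import math


def link_probability(x, r0, eta, sigma):
    """Probability that the RSS at distance x exceeds P_min.

    Equals Pr(Z >= 10 * eta * log10(x / r0)) for Z ~ N(0, sigma^2);
    sigma = 0 reduces to the unit disk model with range r0.
    """
    if sigma == 0.0:
        return 1.0 if x <= r0 else 0.0
    threshold_db = 10.0 * eta * math.log10(x / r0)
    return 0.5 * math.erfc(threshold_db / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (5.0, 10.0, 20.0):
        print(x, round(link_probability(x, 10.0, 3.0, 4.0), 4))
```

With shadowing, nodes beyond $r_0$ can still be connected and nodes within $r_0$ can be disconnected; at $x = r_0$ the link probability is exactly one half.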

Shadowing makes the RSS vary around its mean value
over space, while the small-scale fading makes the RSS
vary around its mean value over time. In this paper,
we consider a generic model of small-scale fading, i.e.
the Nakagami-m fading [26]. By choosing different values
for the parameter $m$ in the Nakagami-m fading model,
the results easily include several widely used fading
distributions as special cases, e.g. Rayleigh distribution
(by setting $m = 1$) and one-sided Gaussian distribution
(by setting $m = 1/2$) [26]. Subject to Nakagami-m fading,
the RSS per symbol, $\omega$, is distributed according to a
Gamma distribution given by the following pdf [26]:

where $\Gamma(\cdot)$ is the standard Gamma function and $\bar{\omega} =
P_l(x)$ is the mean RSS (over time), which is determined
by path loss and shadowing.
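
The Gamma-distributed RSS under Nakagami-m fading can be sampled with the Python standard library. The sketch below assumes the usual parameterization in which the received power has shape $m$ and scale equal to the mean RSS divided by $m$, so that its mean is the mean RSS; the function name and sample size are illustrative choices.

```python
import random


def sample_rss(mean_rss, m, rng):
    """Draw one RSS sample: Gamma with shape m and scale mean_rss / m.

    The mean of the distribution is shape * scale = mean_rss.
    m = 1 gives an exponential power, i.e. Rayleigh amplitude fading.
    """
    return rng.gammavariate(m, mean_rss / m)


_rng = random.Random(0)
samples = [sample_rss(1.0, 1.0, _rng) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)

if __name__ == "__main__":
    print(sample_mean)  # close to the mean RSS of 1.0
```

Larger values of $m$ concentrate the samples around the mean RSS, which corresponds to milder fading.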

We  firstly  conduct  the  analysis  in  the  unit  disk  model, 
then  we  introduce  the  analysis  in  the  more  realistic 
Log-normal-Nakagami  model,  which  takes  into  account 
statistical  variations  of  RSS  around  the  mean  value  due 
to  both  log-normal  shadowing  and  Nakagami-m  fading. 

2.2  Per  hop  energy  consumption 

Assume that the time spent on transmitting a packet
of unit size over a single hop is a constant $T_t$, and
all nodes transmit at the same power $P_t$, which results
in a transmission range (without shadowing) of $r_0$ in
the unit disk model. Therefore the energy consumed in
transmitting a packet over a single hop is:

$$Eng(r_0) = \frac{T_t P_t + Eng_c}{1 - \alpha(r_0)} = \frac{C_2 r_0^{\eta} + Eng_c}{1 - \alpha(r_0)} \qquad (2)$$

where $C_2 = C_1 T_t$ is a constant, $Eng_c$ is another constant
which includes the processing power consumption
and receiving power consumption in each node,
$\alpha(r_0)$ is the packet error rate [27] (a function of $W_{\min}$ and $N_b$),
$W_{\min}$ is the minimum contention window size and $N_b$
is the average node degree: $N_b = \rho \pi r_0^2$. Packet collision
can increase the energy consumption, due to the
consequent re-transmission of a packet, especially when
the transmission range is large. To illustrate this effect,
we implemented simulations (in Section 6) using the
parameters shown in [27], i.e. $W_{\min} = 64$. The values of
$C_2$ and $Eng_c$ are dependent on hardware specifications;
some typical values can be found in [28].
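
Equation (2) can be evaluated as in the sketch below. Because the packet error rate expression from [27] is not reproduced here, it is passed in as a caller-supplied function; that interface, the function name and the numerical values are assumptions made for illustration.

```python
def per_hop_energy(r0, eta, c2, eng_c, packet_error_rate):
    """Energy per hop: (C_2 * r0^eta + Eng_c) / (1 - alpha(r0)).

    packet_error_rate is a hypothetical callable mapping r0 to alpha(r0).
    """
    alpha = packet_error_rate(r0)
    if not 0.0 <= alpha < 1.0:
        raise ValueError("packet error rate must lie in [0, 1)")
    return (c2 * r0 ** eta + eng_c) / (1.0 - alpha)


if __name__ == "__main__":
    # With no packet errors the cost is just transmission plus processing.
    print(per_hop_energy(2.0, 2.0, 1.0, 1.0, lambda r: 0.0))
    # A 50% error rate doubles the expected cost per delivered packet.
    print(per_hop_energy(2.0, 2.0, 1.0, 1.0, lambda r: 0.5))
```

The division by $1 - \alpha(r_0)$ accounts for the expected number of transmission attempts needed for one successful delivery.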

2.3  Routing  algorithm 

In addition to the impact of fading on the wireless
channel between two nodes, fading also affects the
performance of higher layer protocols. The impact of fading
on higher layer protocols remains to be fully investigated
[22]. In this paper, we consider the cross-layer issues
by analyzing the performance of a wireless multi-hop
network using the greedy forwarding (GF) routing algorithm,
as a typical example of distributed routing algorithms.
GF routing belongs to the category of geographic
routing algorithms and is a widely used routing
algorithm for wireless multi-hop networks. Using GF, each
node makes routing decisions independently of other
nodes by using its own location information, the location
information of its neighboring nodes and the locations
of the source and the destination. GF has shown great
potential in wireless multi-hop networks because of its
distributed nature, low control overhead and capability
of adapting to dynamic network topologies [14]. The area
has attracted significant research interest, e.g. [5]-[8].

We consider a basic GF algorithm that operates following two rules
[7]: 1) Every node tries to forward the packet to the node within its
transmission range which is closest to the destination. 2) A packet
will be dropped if a node cannot find a next-hop neighbor that is
closer to the destination than itself, and hence the transmission
becomes unsuccessful. Moreover, in the case of ties, viz. when more
than one node has the same Euclidean distance to the destination, an
arbitrary one of those nodes can be chosen as the next-hop node
without affecting the results of our analysis. This is because the
way ties are settled does not affect the probability distribution of
the remaining distance to the destination at each hop, which is the
quantity used to derive our results, as shown in Section 3.2.
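
The two rules can be sketched in a few lines of Python. This is only an illustrative sketch under the unit disk model, not the simulator used in the paper; the function name `greedy_forward` and the representation of nodes as a list of coordinate pairs are assumptions made for the example.

```python
import math

def greedy_forward(nodes, src, dst, r0):
    """Return the GF hop count from src to dst, or None if the packet is dropped.

    nodes: list of (x, y) positions; src, dst: indices into nodes;
    r0: transmission range (unit disk model, hypothetical interface).
    """
    def dist(a, b):
        return math.hypot(nodes[a][0] - nodes[b][0], nodes[a][1] - nodes[b][1])

    current, hops = src, 0
    while current != dst:
        # Rule 1: among the neighbors within range, pick the one closest
        # to the destination. Ties keep the first candidate found, which
        # is one arbitrary choice among the tied nodes.
        best, best_d = None, dist(current, dst)
        for j in range(len(nodes)):
            if j != current and dist(current, j) <= r0:
                d = dist(j, dst)
                if d < best_d:  # must be strictly closer than the current node
                    best, best_d = j, d
        # Rule 2: no neighbor is closer to the destination, so drop the packet.
        if best is None:
            return None
        current, hops = best, hops + 1
    return hops
```

Because every hop strictly reduces the remaining distance, the loop always terminates, either at the destination or at a routing void.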

Note that a number of complicated recovery algorithms have been
proposed to route a packet around the routing void [29]. The quality
of the path established by a greedy forwarding algorithm can be
measured by the stretch factor, which quantifies the difference
between a particular path and the shortest path [29]. By studying the
stretch factor, it is shown in [29] that a basic GF algorithm can
successfully find short routing paths in sensing-covered networks,
without complex recovery algorithms. For analytical tractability and
generality of the results, we consider the basic greedy forwarding
algorithm without any recovery algorithm, as in [5]-[8].

2.4  Definitions  of  some  terms 

Denote by $E_s[k_s|x_0]$ the expected number of hops for a packet to
reach the destination, conditioned on the Euclidean distance between
source and destination being $x_0$ and the transmission being
successful. For convenience, throughout this paper we use
"conditioned on $x_0$" for "conditioned on the Euclidean distance
between source and destination being $x_0$". Denote by $E_u[k_u|x_0]$
the expected number of hops traversed by a packet before it is
dropped due to the nonexistence of a next-hop node closer to the
destination, conditioned on $x_0$ and the transmission being
unsuccessful (in this case $x_0$ is the distance between the source
and the intended destination).

It is worth noting that under the assumption that the network is
connected, as used in e.g. [7], the number of hops between two nodes
increases as the transmission range (hence the average node degree)
decreases. In terms of energy consumption, the assumption of a
connected network leads to the misleading conclusion that a smaller
transmission range is always better. This conclusion is misleading
because it ignores the fact that the probability of having a
multi-hop path between two nodes decreases as the transmission range
decreases. It is in sharp contrast with the result obtained in this
paper, which accounts for the possibility of disconnected networks:
there exists an optimum transmission range that minimizes the energy
consumption, as shown in Fig. 7. Our analysis does not rely on the
assumption that the network is connected.

Consider a network with a total of $N$ distinct source and
destination pairs, where each source is separated from the associated
destination by Euclidean distance $x_0$. Each source transmits a
packet of unit size to the associated destination, so a total of $N$
packets are transmitted. Assume that $M$ ($M \le N$) packets reach
their respective destinations successfully.

Define $\mathrm{Eng}_{\mathrm{eff}}(r_0|x_0)$ to be the effective
energy consumption per successfully transmitted packet for any pair
of nodes separated by Euclidean distance $x_0$, viz.
$\mathrm{Eng}_{\mathrm{eff}}(r_0|x_0)$ is the total energy spent on
transmitting all packets divided by the number of successfully
received packets:

$$\mathrm{Eng}_{\mathrm{eff}}(r_0|x_0) = \frac{M\,\mathrm{Eng}(r_0)E_s[k_s|x_0] + (N-M)\,\mathrm{Eng}(r_0)E_u[k_u|x_0]}{M} = \mathrm{Eng}(r_0)\,\frac{\phi_s(x_0)E_s[k_s|x_0] + (1-\phi_s(x_0))E_u[k_u|x_0]}{\phi_s(x_0)} \tag{3}$$

where $\phi_s(x_0) := M/N$ is the probability of successful
transmission between any pair of nodes separated by $x_0$.

In a network where the transmission range without shadowing and
fading is $r_0$, given the distribution $f(x_0)$ of the Euclidean
distance between any pair of nodes, examples of which are given in
[30], the average effective energy consumption is:

$$\mathrm{Eng}_{\mathrm{eff}}(r_0) = \int \mathrm{Eng}_{\mathrm{eff}}(r_0|x_0)\,f(x_0)\,dx_0 \tag{4}$$

The effective energy consumption is a measure of the energy spent on
each successfully transmitted packet. A lower
$\mathrm{Eng}_{\mathrm{eff}}$ means a higher energy efficiency. We use
$\mathrm{Eng}_{\mathrm{eff}}$ as the metric to investigate the energy
efficiency in end-to-end packet transmissions.
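
As an illustration of Eq. 3 and Eq. 4, the following Python sketch evaluates the effective energy consumption once the success probability and the two conditional hop counts are known. The function names, the callable-based interface and the use of the trapezoidal rule are assumptions made for this example only.

```python
def effective_energy(eng_r0, phi_s, e_s, e_u):
    """Eq. 3: energy spent per successfully delivered packet for one x0.

    eng_r0: per-hop energy Eng(r0); phi_s: success probability;
    e_s, e_u: expected hop counts of successful / unsuccessful transmissions.
    """
    if not 0.0 < phi_s <= 1.0:
        raise ValueError("phi_s must lie in (0, 1]")
    return eng_r0 * (phi_s * e_s + (1.0 - phi_s) * e_u) / phi_s


def average_effective_energy(eng_r0, x0_grid, f_x0, phi_s, e_s, e_u):
    """Eq. 4 by the trapezoidal rule; f_x0, phi_s, e_s, e_u are callables of x0."""
    vals = [effective_energy(eng_r0, phi_s(x), e_s(x), e_u(x)) * f_x0(x)
            for x in x0_grid]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (x0_grid[i + 1] - x0_grid[i])
               for i in range(len(x0_grid) - 1))
```

For example, with Eng(r0) = 2, a success probability of 0.5, and 4 and 2 expected hops for successful and unsuccessful packets respectively, each delivered packet costs 2 x (0.5 x 4 + 0.5 x 2) / 0.5 = 12 energy units.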

3  Analysis  in  the  unit  disk  model 

In this section, we analyze the hop count statistics (in particular
the probability $\Pr(k|x)$) and the effective energy consumption
under the unit disk model. We start with the calculation of the
probability that two arbitrary nodes are $k$ hops apart for $k \ge 3$
using GF. The analysis for $k = 1, 2$ is straightforward.

3.1  Spatial  dependence  problem 

Before going into the analysis, we introduce the spatial dependence
problem in the analysis of hop count statistics using the unit disk
model as an example. The same problem also exists in other models.
Generally, there are two types of spatial dependence problems.

First, it can be shown that the event that a randomly chosen node is
a $k$th hop node ($S_k$) from a randomly chosen source node ($S$) is
not independent of the event that another randomly chosen node is an
$i$th hop node for $1 \le i < k$. Denote by $C(S_k, r)$ the disk
centered at $S_k$ with radius $r$. As shown in the example in Fig.
1(a), the fact that $S_{k-1}$ is a $(k-1)$th hop node from a source
node $S$ (not shown in the figure) implies that there is at least one
node in the area $C(S_{k-3}, r_0) \cap C(S_{k-1}, r_0)$. On the other
hand, the fact that $S_k$ is a $k$th hop node from $S$ implies that
there is no node in the area $C(S_{k-3}, r_0) \cap C(S_k, r_0)$,
otherwise $S_k$ will become a $(k-2)$th hop node. These two areas
overlap, which means that the event that $S_k$ is a $k$th hop node
and the event that $S_{k-1}$ is a $(k-1)$th hop node are not
independent.

Fig. 1: Illustration of the spatial dependence problems in the hop
count statistics using a unit disk model: (a) dependence problem 1;
(b) dependence problem 2. $S_k$ is the $k$th hop node and $r_0$ is
the transmission range.

Second, it can be shown that the event that a randomly chosen node is
a $k$th hop node from $S$ is not independent of the event that
another randomly chosen node is a $k$th hop node from $S$. As shown
in the example in Fig. 1(b), the fact that $S_k$ is a $k$th hop node
from $S$ implies that there is at least one node (the $(k-1)$th hop
node) in the area $A_2 = C(S_{k-2}, r_0) \cap C(S_k, r_0)$. The fact
that another node $S_k'$ is a $k$th hop node from the same $S$
implies that there is at least one node (the $(k-1)$th hop node) in
the area $A_2' = C(S_{k-2}, r_0) \cap C(S_k', r_0)$. These two areas
overlap, which means that the event that $S_k$ is a $k$th hop node
and the event that $S_k'$ is a $k$th hop node are not independent.
In this paper, a significant improvement in the accuracy of
$\Pr(k|x)$ is shown by reducing the inaccuracy associated with the
first type of spatial dependence problem. The second type of spatial
dependence problem can be handled by a technique similar to the one
used in this paper. Specifically, when considering the area covered
by the transmission range of a $k$th hop node, we need to consider
the overlap between the area covered by the transmission range of
that $k$th hop node and the area covered by the transmission ranges
of other $k$th hop nodes. However, it can be seen in Section 7 that
the result is fairly accurate after a proper handling of the first
type of spatial dependence problem, so that specific handling of the
second spatial dependence problem is effectively not warranted.


3.2  Distribution  of  the  remaining  distance 

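Denote by $A(x, r_1, r_2)$ the intersectional area of two disks with
distance $x$ between centers and radii $r_1$ and $r_2$ respectively.
For $|r_1 - r_2| < x < r_1 + r_2$, the size of the area is [31]:

$$A(x, r_1, r_2) = r_1^2 \arccos\left(\frac{x^2 + r_1^2 - r_2^2}{2 x r_1}\right) + r_2^2 \arccos\left(\frac{x^2 + r_2^2 - r_1^2}{2 x r_2}\right) - \frac{1}{2}\sqrt{\left((r_1 + r_2)^2 - x^2\right)\left(x^2 - (r_1 - r_2)^2\right)} \tag{5}$$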
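
The intersectional area $A(x, r_1, r_2)$ can be evaluated with the standard two-disk (lens) area formula. The Python sketch below is one possible implementation; the function name and the explicit handling of the disjoint and nested cases, which fall outside the range $|r_1 - r_2| < x < r_1 + r_2$, are assumptions added for the example.

```python
import math

def intersection_area(x, r1, r2):
    """Area shared by two disks with center distance x and radii r1, r2."""
    if x >= r1 + r2:
        # Disjoint disks (or disks touching at a single point).
        return 0.0
    if x <= abs(r1 - r2):
        # The smaller disk lies entirely inside the larger one.
        return math.pi * min(r1, r2) ** 2
    s = (x * x + r1 * r1 - r2 * r2) / (2.0 * x * r1)
    t = (x * x + r2 * r2 - r1 * r1) / (2.0 * x * r2)
    w = ((r1 + r2) ** 2 - x * x) * (x * x - (r1 - r2) ** 2)
    # Clamp against rounding error very close to the tangent configurations.
    s = max(-1.0, min(1.0, s))
    t = max(-1.0, min(1.0, t))
    w = max(0.0, w)
    return r1 * r1 * math.acos(s) + r2 * r2 * math.acos(t) - 0.5 * math.sqrt(w)
```

For two unit disks whose centers are one unit apart, the function returns $2\pi/3 - \sqrt{3}/2$, the familiar area of the symmetric lens.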




Fig. 2: Possible positions for the node at the $k$th hop, denoted by
$S_k$, are located on the arc, considering the positions of $S_{k-1}$
and $S_{k-2}$, which are the nodes at the $(k-1)$th hop and $(k-2)$th
hop from the source respectively. $A_1$, $A_2$, $x_k$, $x_{k-1}$ and
$x_{k-2}$ are described in the following text.


Define $x_k$ to be the remaining Euclidean distance between the $k$th
hop node ($S_k$) and the destination ($D$). Define
$A_1 = A(x_{k-1}, r_0, x_k)$ to be the intersectional area of the
disks $C(S_{k-1}, r_0)$ and $C(D, x_k)$. Similarly we have
$A_2 = A(x_{k-2}, r_0, x_k)$. Next we record the form of
$\partial A(x, r_1, r_2)/\partial r_2$, which will be used later. For
$|r_1 - r_2| < x < r_1 + r_2$:

$$\frac{\partial A(x, r_1, r_2)}{\partial r_2} = \frac{-r_1^2}{\sqrt{1 - S^2}}\frac{\partial S}{\partial r_2} + 2 r_2 \arccos(T) + r_2^2\,\frac{-1}{\sqrt{1 - T^2}}\frac{\partial T}{\partial r_2} - \frac{1}{4\sqrt{W}}\frac{\partial W}{\partial r_2} \tag{6}$$

where $S = \frac{x^2 + r_1^2 - r_2^2}{2 x r_1}$,
$T = \frac{x^2 + r_2^2 - r_1^2}{2 x r_2}$ and
$W = ((r_1 + r_2)^2 - x^2)(x^2 - (r_1 - r_2)^2)$.

Define $f(x_k, k|x_0)$ to be the joint pdf of the remaining Euclidean
distance to the destination from $S_k$ being $x_k$ and the packet
having been successfully forwarded $k$ hops, conditioned on $x_0$.
Due to the spatial dependence problem, $f(x_k, k|x_0)$ depends on the
remaining distances of all previous hop nodes, i.e. $x_{k-1}$,
$x_{k-2}$, ..., $x_0$. We consider no more than two previous hops and
the justification is given in Section 4.

Define $g(x_k, k|x_{k-1}, x_{k-2}, k-1)$ to be the joint pdf of the
remaining Euclidean distance to the destination at $S_k$ being $x_k$
and the packet having been successfully forwarded $k$ hops,
conditioned on $B$, where $B$ is the event that the remaining
distances at $S_{k-1}$ and $S_{k-2}$ are $x_{k-1}$ and $x_{k-2}$
respectively and the packet has been successfully forwarded $k-1$
hops. (Note that a packet having been successfully forwarded $k-1$
hops necessarily means that it has been successfully forwarded $i$
hops for $i < k-1$.) Accordingly define the cdf (cumulative
distribution function) of the remaining distance at the $k$th hop
node to be $\Pr(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1)$. Ignoring the
boundary effect, whose impact will be discussed in detail later, the
cdf is equal to the probability that there is at least one node in
area $A_1 \setminus A_2$, as indicated by the uniform-shaded area in
Fig. 2. The area $A_2$ needs to be excluded because if there is a
node in this area, that node will be closer to the destination than
$S_{k-1}$, which violates the condition that $S_{k-1}$ is the
$(k-1)$th hop node using GF. We approximate the size of
$A_1 \setminus A_2$ by $A_1 - A_2$. This approximation greatly
simplifies the calculation while giving a sufficiently accurate
result, as validated in Section 6. Due to space limitations, we omit
analytical studies on the accuracy of the approximation. Then:

$$\Pr(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1) = 1 - \exp\left(-\rho\left(A(x_{k-1}, r_0, x_k) - A(x_{k-2}, r_0, x_k)\right)\right) \tag{7}$$

For any two nodes close to the border, the intersectional area of the
transmission ranges of the two nodes may be partially located outside
the network area, which causes an error in computing the size of the
area $A_1 \setminus A_2$ in Eq. 7. This is the boundary effect.
Ignoring the boundary effect generally causes an overestimation of
the size of $A_1 \setminus A_2$, hence an overestimation of the
probability of finding the next-hop node. However, simulation results
in Section 6 show that the boundary effect has very limited impact on
the accuracy of the analytical results.

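Taking the derivative of the cdf with respect to $x_k$, we have:

$$g(x_k, k|x_{k-1}, x_{k-2}, k-1) = \frac{\partial \Pr(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1)}{\partial x_k} = \rho\left(\frac{\partial A(x_{k-1}, r_0, x_k)}{\partial x_k} - \frac{\partial A(x_{k-2}, r_0, x_k)}{\partial x_k}\right) \exp\left(-\rho\left(A(x_{k-1}, r_0, x_k) - A(x_{k-2}, r_0, x_k)\right)\right) \tag{8}$$

where the partial differentiations are given by Eq. 6.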
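
Eq. 6 to Eq. 8 can be checked numerically. The Python sketch below implements the lens area, its partial derivative with respect to the second radius, and the resulting cdf and pdf of the remaining distance. All function names are illustrative, the code is only valid inside the range $|r_1 - r_2| < x < r_1 + r_2$, and the closed forms used for the derivatives of $S$, $T$ and $W$ are worked out here rather than taken from the paper.

```python
import math

def lens_terms(x, r1, r2):
    """The quantities S, T and W used in the area formula and in Eq. 6."""
    s = (x * x + r1 * r1 - r2 * r2) / (2.0 * x * r1)
    t = (x * x + r2 * r2 - r1 * r1) / (2.0 * x * r2)
    w = ((r1 + r2) ** 2 - x * x) * (x * x - (r1 - r2) ** 2)
    return s, t, w

def area(x, r1, r2):
    """Lens area; assumes |r1 - r2| < x < r1 + r2."""
    s, t, w = lens_terms(x, r1, r2)
    return r1 * r1 * math.acos(s) + r2 * r2 * math.acos(t) - 0.5 * math.sqrt(w)

def darea_dr2(x, r1, r2):
    """Eq. 6: partial derivative of the lens area with respect to r2."""
    s, t, w = lens_terms(x, r1, r2)
    ds = -r2 / (x * r1)                                   # dS/dr2
    dt = (r1 * r1 + r2 * r2 - x * x) / (2.0 * x * r2 * r2)  # dT/dr2
    dw = (2.0 * (r1 + r2) * (x * x - (r1 - r2) ** 2)
          + 2.0 * (r1 - r2) * ((r1 + r2) ** 2 - x * x))   # dW/dr2
    return (-r1 * r1 / math.sqrt(1.0 - s * s) * ds
            + 2.0 * r2 * math.acos(t)
            - r2 * r2 / math.sqrt(1.0 - t * t) * dt
            - dw / (4.0 * math.sqrt(w)))

def cdf_next_hop(xk, xk1, xk2, r0, rho):
    """Eq. 7: cdf of the remaining distance at hop k (boundary effect ignored)."""
    return 1.0 - math.exp(-rho * (area(xk1, r0, xk) - area(xk2, r0, xk)))

def pdf_next_hop(xk, xk1, xk2, r0, rho):
    """Eq. 8: derivative of Eq. 7 with respect to xk."""
    slope = darea_dr2(xk1, r0, xk) - darea_dr2(xk2, r0, xk)
    return rho * slope * math.exp(-rho * (area(xk1, r0, xk) - area(xk2, r0, xk)))
```

Comparing `darea_dr2` and `pdf_next_hop` against central finite differences of `area` and `cdf_next_hop` is a quick sanity check that the reconstructed expressions are mutually consistent.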

Define $h(x_k, x_{k-1}, k|x_0)$ to be the joint pdf of the remaining
Euclidean distances at the $k$th hop node and $(k-1)$th hop node
being $x_k$ and $x_{k-1}$ respectively and the packet having been
successfully forwarded $k$ hops, conditioned on $x_0$.

For $k = 1$, it is straightforward that:

$$f(x_1, k = 1|x_0) = \rho\,\frac{\partial A(x_0, r_0, x_1)}{\partial x_1}\, e^{-\rho A(x_0, r_0, x_1)} \tag{9}$$

For convenience, $f(x_i, k = i|x_0)$ is denoted by $f(x_i, i|x_0)$
hereafter. Based on the above result, for $k = 2$ we have:

$$h(x_2, x_1, 2|x_0) = g(x_2, 2|x_1, x_0, 1)\, f(x_1, 1|x_0) \tag{10}$$

For $k > 2$, $h(x_k, x_{k-1}, k|x_0)$ can be calculated recursively:

$$h(x_k, x_{k-1}, k|x_0) = \int_{r_0}^{x_0} g(x_k, k|x_{k-1}, x_{k-2}, k-1)\, h(x_{k-1}, x_{k-2}, k-1|x_0)\, dx_{k-2} \tag{11}$$

Finally for $k > 1$ we have:

$$f(x_k, k|x_0) = \int h(x_k, x_{k-1}, k|x_0)\, dx_{k-1} \tag{12}$$
3.3  Hop  count  statistics

Define $\Pr(k|x_0)$ to be the probability that the destination can be
reached at the $k$th hop conditioned on $x_0$. The destination can be
reached at the $k$th hop if the $(k-1)$th hop node is within the
transmission range of the destination. Therefore:

$$\Pr(k|x_0) = \int_0^{r_0} f(x_{k-1}, k-1|x_0)\, dx_{k-1} \tag{13}$$

3.4  Results  for  successful  transmissions 

Denote by $\Pr_s(k_s|x_0)$ the conditional probability that a packet
can reach its destination at the $k_s$th hop, conditioned on $x_0$
and the transmission being successful. It follows that
$\Pr(k_s|x_0) = \Pr_s(k_s|x_0)\,\phi_s(x_0)$, and:

$$\sum_{k_s=1}^{\infty} k_s \Pr(k_s|x_0) = \phi_s(x_0) \sum_{k_s=1}^{\infty} k_s \Pr\nolimits_s(k_s|x_0) = \phi_s(x_0)\, E_s[k_s|x_0] \tag{14}$$

where $\phi_s(x_0)$ and $E_s[k_s|x_0]$ are defined in Section 2.4. In
practice, an upper bound on $k_s$ can be found beyond which
$\Pr(k_s|x_0)$ is 0, so Eq. 14 and other similar equations only need
to be computed for a finite range of $k_s$.

An end-to-end packet transmission is successful if a packet can reach
the destination at any number of hops. Therefore:

$$\phi_s(x_0) = \sum_{k_s=1}^{\infty} \Pr(k_s|x_0) \tag{15}$$
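
Once $\Pr(k|x_0)$ has been tabulated for a finite range of hop counts, Eq. 14 and Eq. 15 reduce to two sums. The following Python sketch shows the computation; the function name and the list-based input format are assumptions of the example.

```python
def success_statistics(pr_k):
    """Return (phi_s, E_s) from a list with pr_k[i] = Pr(k = i + 1 | x0)."""
    phi_s = sum(pr_k)                                        # Eq. 15
    if phi_s == 0.0:
        return 0.0, float("nan")                             # no packet is ever delivered
    weighted = sum((i + 1) * p for i, p in enumerate(pr_k))  # left-hand side of Eq. 14
    return phi_s, weighted / phi_s                           # E_s[k_s | x0]
```

For instance, with reach probabilities 0.1, 0.3 and 0.2 at hops 1, 2 and 3, the success probability is 0.6 and the expected hop count of a successful transmission is 1.3 / 0.6, i.e. about 2.17.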


3.5  Results  for  unsuccessful  transmissions 


Define $\phi_u(k|x_0)$ to be the probability of a packet having been
successfully forwarded $k$ hops from the source toward a destination
$x_0$ apart, but not reaching the destination in $k$ hops, which
distinguishes $\phi_u(k|x_0)$ from $\Pr(k|x_0)$. Therefore:

$$\phi_u(k|x_0) = \int_0^{x_0} f(x_k, k|x_0)\, dx_k \tag{16}$$

Based on the example introduced in Section 2.4, we further assume
that only $M_k$ out of $N$ packets reach the $k$th hop nodes. Then
$\phi_u(k|x_0) = M_k/N$. At the next hop, there are three
possibilities for each of these $M_k$ packets: 1) the packet reaches
the destination at the next hop; 2) the packet makes another hop
without reaching the destination; 3) the packet is dropped. Let
$W_{k+1}$ and $M_{k+1}$ be the number of packets for which the first
and second possibilities apply, respectively.

Define $\psi(k_u|x_0)$ to be the probability of a packet being
dropped at the $k_u$th hop. Then:

$$\psi(k_u|x_0) = \frac{M_{k_u} - M_{k_u+1} - W_{k_u+1}}{N} = \phi_u(k_u|x_0) - \phi_u(k_u + 1|x_0) - \Pr(k_u + 1|x_0) \tag{17}$$


The average number of hops for unsuccessful transmissions between a
source and a destination separated by $x_0$ is the expected value of
$k_u$, whose distribution is given by $\psi(k_u|x_0)$. Similar to the
derivation of Eq. 14, we have:
$\sum_{k_u=1}^{\infty} k_u\, \psi(k_u|x_0) = (1 - \phi_s(x_0))\, E_u[k_u|x_0]$.
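
The bookkeeping behind Eq. 17 and the expression for $E_u[k_u|x_0]$ can be sketched as follows in Python. The function name, the list-based inputs and the truncation at a maximum hop count $K$ (beyond which both probabilities are taken to be zero) are assumptions of this example.

```python
def drop_statistics(phi_u, pr_k):
    """Return (psi, E_u) for hops 1..K.

    phi_u[i] = phi_u(i + 1 | x0) and pr_k[i] = Pr(i + 1 | x0). The sketch
    assumes that no packet survives beyond hop K = len(phi_u).
    """
    K = len(phi_u)
    psi = []
    for i in range(K):
        next_phi = phi_u[i + 1] if i + 1 < K else 0.0
        next_pr = pr_k[i + 1] if i + 1 < K else 0.0
        psi.append(phi_u[i] - next_phi - next_pr)          # Eq. 17
    phi_s = sum(pr_k)                                      # Eq. 15
    if phi_s >= 1.0:
        return psi, float("nan")                           # every packet is delivered
    weighted = sum((i + 1) * p for i, p in enumerate(psi))
    return psi, weighted / (1.0 - phi_s)                   # E_u[k_u | x0]
```

As a worked example with $N = 10$ packets: if 8 packets make a first hop without reaching the destination, 3 reach the destination and 4 continue at the second hop, and the remaining 4 arrive at the third hop, then one packet is dropped after its first hop, two are dropped at the source, and the expected hop count of an unsuccessful transmission is 1/3.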

Given the above analysis, the effective energy consumption can be
computed using Eq. 4, which is shown in Section 6. Further, the above
results can also be useful in the analysis of delay, throughput or
reliability of end-to-end packet transmissions [8], [20], as well as
localization [8], [32], which is left for future work.

4  Impact  of  spatial  dependence  problem 

In this paper, we considered that the remaining distance at the $k$th
hop node ($S_k$) depends on the remaining distances at the previous
two hop nodes ($S_{k-1}$ and $S_{k-2}$). Due to the spatial
dependence problem, it can be shown that a correct analysis of the
hop count statistics requires all previous hops to be considered, but
the calculation is then more complicated than if an independence
assumption is made. Previous research, e.g. [7], usually considered
the dependence on the previous one hop only. In this section we study
the impact of the spatial dependence problem on the accuracy of
$\Pr(k|x)$.

Define $A_m$ to be the intersectional area of the disk centered at
$S_{k-m}$ with radius $r_0$ and the disk centered at $D$ with radius
$x_k$. Therefore the precise area that should be considered in the
calculation of Eq. 7 is
$A = A_1 \setminus (A_2 \cup A_3 \cup \ldots \cup A_k)$ instead of
$A_1 \setminus A_2$.

Consider only the previous one hop, then: $A \approx A_1$. Consider
only the previous two hops, then:
$A \approx A_1 \setminus A_2 = A_1 - \underline{A_1 \cap A_2}$.
Consider only the previous three hops, then:

$$A \approx A_1 \setminus (A_2 \cup A_3) = A_1 - A_1 \cap (A_2 \cup A_3) = A_1 - A_1 \cap A_2\; \underline{-\, A_1 \cap A_3 + A_1 \cap A_2 \cap A_3} \tag{18}$$

The underlined terms are the additional terms introduced when
considering one more previous hop. In considering the previous $m$
hops instead of the previous $m-1$ hops, the improvement is bounded
by a term determined by $A_1 \cap A_m$. Furthermore, it is evident
that $x_{k-1} < x_{k-2} < \cdots < x_0$. Therefore
$A_1 > A_2 > \ldots > A_k$ and the size of $A_1 \cap A_m$ is
dominated by the size of $A_m$.

Define $h(x_k, x_{k-m}, k|x_0)$ to be the joint pdf of the remaining
Euclidean distances at the $k$th hop node and $(k-m)$th hop node
being $x_k$ and $x_{k-m}$ respectively and the packet having been
successfully forwarded $k$ hops, conditioned on $x_0$. Then the
expected size of $A_m$ at the $k$th hop can be calculated by:

$$E[A_m, k|x_0] = \int_0^{x_0} \int_{x_k}^{x_k + r_0} A(x_{k-m}, r_0, x_k)\, h(x_k, x_{k-m}, k|x_0)\, dx_{k-m}\, dx_k \tag{19}$$

For $m = 1$, $h(x_k, x_{k-1}, k|x_0)$ can be calculated using Eq. 11.
For $m = 2$, we have:

$$h(x_k, x_{k-2}, k|x_0) = \int_{r_0}^{x_0} g(x_k, k|x_{k-1}, x_{k-2}, k-1)\, h(x_{k-1}, x_{k-2}, k-1|x_0)\, dx_{k-1} \tag{20}$$

For $m \ge 3$, the calculation becomes more complicated. But
approximately
$h(x_k, x_{k-m}, k|x_0) \approx f(x_k, k|x_0)\, f(x_{k-m}, k-m|x_0)$,
where $f(x_k, k|x_0)$ is given by Eq. 12. This approximation is valid
because the distance between $S_k$ and $S_{k-m}$ generally increases
as $m$ increases, hence the size of the overlapping area decreases.
Therefore the correlation between $f(x_k, k|x_0)$ and
$f(x_{k-m}, k-m|x_0)$ reduces as $m$ increases.


Fig. 3: Simulation (Sim) and analytical (Ana) results for the
normalized average intersectional area size in the unit disk model.
$A_m$ is the intersectional area of the disk centered at $S_{k-m}$
with radius $r_0$ and the disk centered at $D$ with radius $x_k$.

Based on the approach introduced above, Fig. 3 shows the results for
the average size of $A_m$ when the source node and the destination
node are separated by distance $x_0 = 10 r_0$. The simulation
parameters are introduced in Section 6. It is evident that the size
of $A_m$, $m \ge 3$, is negligibly small (less than 1% of the size of
the area covered by the transmission range) compared to the sizes of
$A_1$ and $A_2$. This validates the claim that the improvement made
by taking the previous $m$ hops into consideration will be marginal
for $m \ge 3$, which explains our choice of considering two previous
hops only.

Our results suggest that the accuracy of the analysis of $\Pr(k|x)$
can be significantly improved by considering the previous two hops
(compared to considering the previous one hop only). However, moving
beyond two hops results in only marginal improvement in the accuracy
of the analysis. Therefore, the conclusion can be drawn that the
locations of nodes three or more hops away provide little information
for a node to determine its geometric relationship with other nodes.
This conclusion provides analytical support for observations, to this
point unsupported by analysis, in routing, localization and network
security that taking into account the (location or link status)
information of two-hop neighbors can significantly improve the
routing [33] (respectively localization [34], network security [35])
performance compared with using one-hop neighborhood information
only. However, beyond two hops, taking into account more neighborhood
information has only marginal impact. Therefore many distributed
routing, localization and network security protocols use two-hop
neighborhood information.

5  Analysis  in  the  Log-normal-Nakagami  model

The technique to incorporate the impacts of both shadowing and
small-scale fading is through the use of the random split property of
a Poisson process.


5.1  Random  split  of  a  Poisson  process 

First, we introduce a random variable named the Nakagami fades
$\Omega_0$, which follows the Gamma distribution with mean 1.
Therefore the pdf of $\Omega_0$ is

$$\zeta_0(\omega_0) = \frac{m^m \omega_0^{m-1}}{\Gamma(m)} \exp(-m\,\omega_0), \quad \omega_0 > 0 \tag{21}$$

where $m$ is introduced in Section 2.1.

It can be shown that the random variable $P_l(x)\Omega_0$ follows the
Gamma distribution with mean $P_l(x)$, where
$P_l(x) = C P_t x^{-\eta}\, 10^{Z/10}$ is the RSS given by the
log-normal shadowing model introduced in Section 2.1. Then in the
Log-normal-Nakagami model, the RSS at a receiver at distance $x$ from
the transmitter is
$P_N(x) = P_l(x)\Omega_0 = C P_t x^{-\eta}\, 10^{Z/10}\, \Omega_0$,
where $Z$ is a zero-mean Gaussian distributed random variable and
$\Omega_0$ is a Gamma distributed random variable with mean 1.

According to the random split property of a Poisson process [36], the
subset of nodes whose RSS from a particular transmitting node has
shadowing fades $Z \in [z, z + dz]$ and Nakagami fades
$\Omega_0 \in [\omega_0, \omega_0 + d\omega_0]$ is distributed
following a Poisson process with intensity
$\rho\, q(z)\, dz\, \zeta_0(\omega_0)\, d\omega_0$. Via the splitting
of the Poisson process, we can study each sub-process by the same
technique used in the unit disk model.

Remark 1: The aforementioned technique can be extended to other
communication models (e.g. the class of random connection models
[37]). In a random connection model [37], two arbitrary nodes
separated by Euclidean distance $x$ are directly connected with
probability $\gamma(x)$, where $\gamma(x)$ satisfies two conditions:
1) the probability is a non-increasing function mapping from the
positive real numbers into $[0, 1]$; 2) the event that a pair of
nodes are directly connected is independent of the event that another
pair of nodes are directly connected.

Define $\Pr_l(k|x_0)$ to be the probability that two arbitrary nodes
separated by Euclidean distance $x_0$ are $k$ hops apart using GF in
the Log-normal-Nakagami model. We start with $k = 1$.

5.2  Probability  of  direct  connection 

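Under the Log-normal-Nakagami model, two nodes separated by distance
$x$ are directly connected iff the RSS exceeds a given threshold
$P_{min}$. Without shadowing and small-scale fading, the model
reduces to the unit disk model where $P_{min} = C P_t r_0^{-\eta}$.
With shadowing and fading, we have:

$$\Pr(P_N(x) \ge P_{min}) = \Pr\left(C P_t x^{-\eta}\, 10^{Z/10}\, \Omega_0 \ge C P_t r_0^{-\eta}\right)$$

$$= \Pr\left(Z \ge 10\eta \log_{10}\left(\frac{x}{r_0 \Omega_0^{1/\eta}}\right)\right) \tag{22}$$

$$= \Pr\left(x \le r_0\, \Omega_0^{1/\eta} \exp\left(\frac{Z \ln 10}{10\eta}\right)\right) \tag{23}$$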
Thus the condition for a direct connection can be stated in either of
the following two equivalent ways: 1) Given the distance $x$ and the
Nakagami fades value $\omega_0$, two nodes are directly connected iff
the (random) shadowing fades
$Z \ge 10\eta \log_{10}\left(x / (r_0 \omega_0^{1/\eta})\right)$.
2) Given the shadowing fades value $z$ and the Nakagami fades value
$\omega_0$, two nodes are directly connected iff their distance
$x \le r_0\, \omega_0^{1/\eta} \exp\left(z \ln 10 / (10\eta)\right)$.

Based on the first condition, the probability of having a direct
connection between two arbitrary nodes separated by $x_0$ is:


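$$\Pr\nolimits_l(k = 1|x_0) = \int_0^{\infty} \int_{10\eta \log_{10}\left(\frac{x_0}{r_0 \omega_0^{1/\eta}}\right)}^{\infty} q(z)\, dz\, \zeta_0(\omega_0)\, d\omega_0 \tag{24}$$

$$= \int_0^{\infty} \frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{10\eta \log_{10}\left(\frac{x_0}{r_0 \omega_0^{1/\eta}}\right)}{\sqrt{2}\,\sigma}\right)\right) \zeta_0(\omega_0)\, d\omega_0 \tag{25}$$

where $\mathrm{erf}(\cdot)$ is the error function and $\sigma$ is the
standard deviation of the shadowing fades $Z$.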

Remark 2: Without small-scale fading, viz. considering the log-normal
shadowing model only, the probability of having a direct connection
between two arbitrary nodes separated by $x_0$ is
$\frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{10\eta \log_{10}(x_0/r_0)}{\sqrt{2}\,\sigma}\right)\right)$.
Similarly, the following analysis can be reduced to the analysis
without small-scale fading by simply removing the integral with
respect to $\omega_0$.
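
The direct-connection probability can be explored numerically with the Python standard library. The sketch below evaluates the shadowing-only closed form of Remark 2 and estimates the Log-normal-Nakagami probability of Eq. 24 by Monte Carlo sampling. The function names are hypothetical, and the symbols `sigma` (standard deviation of $Z$ in dB) and `m` (Nakagami shape parameter) are assumed to match the model parameters introduced in Section 2.1.

```python
import math
import random

def p_connect_shadowing(x0, r0, eta, sigma):
    """Remark 2: direct-connection probability with log-normal shadowing only."""
    a = 10.0 * eta * math.log10(x0 / r0)
    return 0.5 * (1.0 - math.erf(a / (math.sqrt(2.0) * sigma)))

def p_connect_monte_carlo(x0, r0, eta, sigma, m, trials, rng):
    """Monte Carlo estimate of Eq. 24.

    Samples Z ~ N(0, sigma^2) and Omega_0 ~ Gamma(shape m, scale 1/m),
    so that Omega_0 has mean 1, and counts how often the RSS threshold is met.
    """
    hits = 0
    for _ in range(trials):
        z = rng.gauss(0.0, sigma)
        omega = rng.gammavariate(m, 1.0 / m)
        # The constants C and Pt appear on both sides and cancel.
        if x0 ** (-eta) * 10.0 ** (z / 10.0) * omega >= r0 ** (-eta):
            hits += 1
    return hits / trials
```

At $x_0 = r_0$ the shadowing-only probability is exactly 1/2, since the threshold on $Z$ is zero. A call such as `p_connect_monte_carlo(10.0, 10.0, 4.0, 4.0, 50.0, 20000, random.Random(1))` gives an estimate close to that value, because a large `m` makes the Nakagami fades concentrate around 1.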

In order to derive $\Pr_l(k|x_0)$ for $k > 1$, we use the second
condition to study the probability of a direct connection. Define
$r_N(z_S, \omega_S)$ to be the transmission range of a transmitter
($S$) conditioned on the shadowing fades and Nakagami fades being
$z_S$ and $\omega_S$ respectively. Then:

$$r_N(z_S, \omega_S) = r_0\, \omega_S^{1/\eta} \exp\left(\frac{z_S \ln 10}{10\eta}\right) \tag{26}$$

Therefore any node whose RSS from the transmitter ($S$) has shadowing
fades $Z_S \in [z_S, z_S + dz_S]$ and Nakagami fades
$\Omega_S \in [\omega_S, \omega_S + d\omega_S]$ is directly connected
to $S$ iff its Euclidean distance to the transmitter is smaller than
or equal to $r_N(z_S, \omega_S)$. This allows us to apply the
analysis used in the unit disk model.
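
The equivalence between the RSS threshold test and the distance test against the conditional transmission range can be illustrated with a short Python sketch. The function names are hypothetical, and the fade `z` is assumed to be expressed in dB, as implied by the factor $10^{Z/10}$.

```python
import math

def effective_range(r0, eta, z, omega):
    """Eq. 26: transmission range given shadowing fade z and Nakagami fade omega."""
    return r0 * omega ** (1.0 / eta) * math.exp(z * math.log(10.0) / (10.0 * eta))

def connected_by_rss(x, r0, eta, z, omega):
    """RSS threshold test; the constants C and Pt cancel on both sides."""
    return x ** (-eta) * 10.0 ** (z / 10.0) * omega >= r0 ** (-eta)
```

With no fading (`z = 0`, `omega = 1`) the range reduces to `r0`. For any other pair of fade values, a node just inside `effective_range(...)` passes `connected_by_rss` and a node just outside fails it, which is what allows each fade-conditioned subset of nodes to be treated as a unit disk model with radius $r_N$.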


5.3  Distribution  of  the  remaining  distance 


Fig. 4: Possible positions for the $k$th hop node ($S_k$) are located
on the arc. Consider the nodes whose RSS from $S_{k-1}$ has fades
$Z_1 \in [z_1, z_1 + dz_1]$ and
$\Omega_1 \in [\omega_1, \omega_1 + d\omega_1]$, while its RSS from
$S_{k-2}$ has fades $Z_2 \in [z_2, z_2 + dz_2]$ and
$\Omega_2 \in [\omega_2, \omega_2 + d\omega_2]$. The dashed-line
circles represent the transmission range of $S_{k-1}$ (resp.
$S_{k-2}$) conditioned on the above values of shadowing and Nakagami
fades. $A_1$ and $A_2$ are described in the following.


Define area sizes $A_1 = A(x_{k-1}, r_N(z_1, \omega_1), x_k)$ and
$A_2 = A(x_{k-2}, r_N(z_2, \omega_2), x_k)$, where $A(x, r_1, r_2)$
and $x_k$ are defined in Section 3. Define $f_l(x_k, k|x_0)$,
$g_l(x_k, k|x_{k-1}, x_{k-2}, k-1)$, the event $\mathcal{B}_l$,
$\Pr_l(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1)$ and
$h_l(x_k, x_{k-1}, k|x_0)$ analogously to Section 3, using the
subscript $l$ to mark the corresponding quantities in the
Log-normal-Nakagami model. We will derive
$\Pr_l(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1)$ by studying the
following two events. Denote by $\mathcal{C}$ the event that there is
at least one node whose Euclidean distance to the destination is
smaller than $x_k$ and which has a direct connection to $S_{k-1}$ and
no direct connection to $S_{k-m}$ for $m \in [2, k]$, where $S_0$ is
the source node. Denote by $\mathcal{D}$ the event that the node
$S_{k-1}$ is not directly connected to the destination. Events
$\mathcal{C}$ and $\mathcal{D}$ are independent because of the
independence of the shadowing and Nakagami fades. It is evident that:

$$\Pr\nolimits_l(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1) = \Pr(\mathcal{C}|\mathcal{B}_l) \times \Pr(\mathcal{D}|\mathcal{B}_l) \tag{27}$$

We start with the analysis of event $\mathcal{C}$. In this paragraph
we only consider the subset of nodes whose RSS from $S_{k-1}$ has
fades $Z_1 \in [z_1, z_1 + dz_1]$ and
$\Omega_1 \in [\omega_1, \omega_1 + d\omega_1]$, while its RSS from
$S_{k-2}$ has fades $Z_2 \in [z_2, z_2 + dz_2]$ and
$\Omega_2 \in [\omega_2, \omega_2 + d\omega_2]$. Due to the
independence of the fades and the property of the Poisson process,
these nodes are distributed following a homogeneous Poisson process
with intensity
$\rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2$.
Denote by $\mathcal{E}$ the event that $Z_1 \in [z_1, z_1 + dz_1]$
and $Z_2 \in [z_2, z_2 + dz_2]$ and
$\Omega_1 \in [\omega_1, \omega_1 + d\omega_1]$ and
$\Omega_2 \in [\omega_2, \omega_2 + d\omega_2]$.
$\Pr(\mathcal{C}, \mathcal{E}|\mathcal{B}_l)$ is equal to the
probability that there is at least one node in area
$A_1 \setminus A_2$, as shown in Fig. 4. We approximate the size of
area $A_1 \setminus A_2$ by $(A_1 - A_2)^+$, where
$(A_1 - A_2)^+ = \max\{0, A_1 - A_2\}$. (Through this approximation
we ignore some rare events that cause $A_1 - A_2 < 0$, which can
possibly occur when $r_N(z_2, \omega_2)$ is much larger than
$r_N(z_1, \omega_1)$. In contrast, under the unit disk model it is
always the case that $A_1 - A_2 \ge 0$.) Considering this subset of
nodes only, $1 - \Pr(\mathcal{C}, \mathcal{E}|\mathcal{B}_l)$ is
equal to
$\exp\left(-(A_1 - A_2)^+ \rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2\right)$,
which is the probability that there is no node in area
$A_1 \setminus A_2$. Note that $A_1$ depends on $z_1$ and $\omega_1$,
while $A_2$ depends on $z_2$ and $\omega_2$.

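Then considering all subsets of nodes, we have:

$$\Pr(\mathcal{C}|\mathcal{B}_l) = 1 - \prod_{z_1, z_2 \in (-\infty, +\infty),\ \omega_1, \omega_2 \in (0, +\infty)} \exp\left(-(A_1 - A_2)^+ \rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2\right) \tag{28}$$

$$= 1 - \exp\left(-\int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (A_1 - A_2)^+ \rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2\right) \tag{29}$$

Since the event $\mathcal{D}$ only depends on $x_{k-1}$, we have:

$$\Pr(\mathcal{D}|\mathcal{B}_l) = 1 - \Pr\nolimits_l(1|x_{k-1}) \tag{30}$$

Substituting Eq. 29 and Eq. 30 into Eq. 27:

$$\Pr\nolimits_l(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1) = \left(1 - \exp\left(-\int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (A_1 - A_2)^+ \rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2\right)\right) \times \left(1 - \Pr\nolimits_l(1|x_{k-1})\right) \tag{31}$$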

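By the Leibniz integral rule:

$$g_l(x_k, k|x_{k-1}, x_{k-2}, k-1) = \frac{\partial \Pr\nolimits_l(X_k \le x_k, k|x_{k-1}, x_{k-2}, k-1)}{\partial x_k}$$

$$= \int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \frac{\partial (A_1 - A_2)^+}{\partial x_k}\, \rho\, q(z_1) q(z_2)\, dz_1 dz_2\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, d\omega_1 d\omega_2$$

$$\times \exp\left(-\int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (A_1 - A_2)^+ \rho\, q(z_1) q(z_2)\, \zeta_0(\omega_1) \zeta_0(\omega_2)\, dz_1 dz_2\, d\omega_1 d\omega_2\right) \left(1 - \Pr\nolimits_l(1|x_{k-1})\right) \tag{32}$$

where $\partial A_1/\partial x_k$ and $\partial A_2/\partial x_k$ can
be calculated by Eq. 6.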
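It is straightforward that for $k = 1$, we have:

$$f_l(x_1, 1|x_0) = \int_0^{\infty}\!\!\int_{-\infty}^{\infty} \frac{\partial A(x_0, r_N(z_1, \omega_1), x_1)}{\partial x_1}\, \rho\, q(z_1)\, dz_1\, \zeta_0(\omega_1)\, d\omega_1 \times \exp\left(-\int_0^{\infty}\!\!\int_{-\infty}^{\infty} A(x_0, r_N(z_1, \omega_1), x_1)\, \rho\, q(z_1)\, dz_1\, \zeta_0(\omega_1)\, d\omega_1\right) \left(1 - \Pr\nolimits_l(1|x_0)\right) \tag{33}$$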

For $k = 2$, the pdf of the remaining distance of the previous hop is
given by Eq. 33. Therefore:

$$h_l(x_2, x_1, 2|x_0) = g_l(x_2, 2|x_1, x_0, 1)\, f_l(x_1, 1|x_0) \tag{34}$$

For $k > 2$, the joint pdf of $X_k$ and $X_{k-1}$ is calculated
recursively:

$$h_l(x_k, x_{k-1}, k|x_0) = \int_0^{x_0} g_l(x_k, k|x_{k-1}, x_{k-2}, k-1)\, h_l(x_{k-1}, x_{k-2}, k-1|x_0)\, dx_{k-2} \tag{35}$$

Finally for $k \ge 2$, we have:

$$f_l(x_k, k|x_0) = \int_0^{x_0} h_l(x_k, x_{k-1}, k|x_0)\, dx_{k-1} \tag{36}$$

5.4  Hop  count  statistics 

Because of shadowing and small-scale fading, the destination can possibly be reached in a single hop no matter how large the remaining distance from that hop is. Therefore for $k \ge 2$:

$$\mathrm{Pr}_l(k \mid x_0) = \int_0^{x_0} \mathrm{Pr}_l(1 \mid x_{k-1})\, f_l(x_{k-1}, k-1 \mid x_0)\, dx_{k-1} \tag{37}$$

Remark 3: Based on the above results, we can calculate the average number of hops for successful and unsuccessful transmissions, the probability of successful transmissions and the effective energy consumption by the same technique used in the unit disk model.
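The recursion in Eqs. 34 to 37 lends itself to numerical evaluation on a grid. The Python sketch below is one possible discretization and not the authors' implementation: it applies the trapezoidal rule on a uniform grid over $[0, x_0]$, and the callables `p1`, `f1` and `g` (standing in for the single-hop probability, Eq. 33 and Eq. 32 respectively) are hypothetical placeholders that the caller must supply.

```python
from typing import Callable, List


def trapezoid(values: List[float], step: float) -> float:
    """Composite trapezoidal rule on an evenly spaced grid."""
    if len(values) < 2:
        return 0.0
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def hop_count_probabilities(
    x0: float,
    k_max: int,
    n: int,
    p1: Callable[[float], float],
    f1: Callable[[float], float],
    g: Callable[[float, float, float], float],
) -> List[float]:
    """Return [Pr(1|x0), ..., Pr(k_max|x0)] following the recursion of Eqs. 34-37.

    p1(x)          -- probability of a direct link over distance x (assumed given)
    f1(x1)         -- pdf of the remaining distance after hop 1 (role of Eq. 33)
    g(xk, xp, xpp) -- pdf of x_k given x_{k-1} = xp and x_{k-2} = xpp (role of Eq. 32)
    n              -- number of grid points on [0, x0], n >= 2
    """
    step = x0 / (n - 1)
    grid = [i * step for i in range(n)]

    probs = [p1(x0)]  # one hop: direct link from the source
    if k_max == 1:
        return probs

    f_prev = [f1(x) for x in grid]  # f(x_1, 1 | x0)
    # Eq. 37 with k = 2
    probs.append(trapezoid([p1(x) * f for x, f in zip(grid, f_prev)], step))

    # Eq. 34: h[i][j] = h(x_2 = grid[i], x_1 = grid[j], 2 | x0)
    h = [[g(xi, xj, x0) * f_prev[j] for j, xj in enumerate(grid)] for xi in grid]

    for k in range(3, k_max + 1):
        # Eq. 36: integrate out the older distance to get f(x_{k-1}, k-1 | x0)
        f_prev = [trapezoid(row, step) for row in h]
        # Eq. 37
        probs.append(trapezoid([p1(x) * f for x, f in zip(grid, f_prev)], step))
        if k < k_max:
            # Eq. 35: advance the joint pdf by one hop
            h = [
                [
                    trapezoid(
                        [g(xi, xj, xm) * h[j][m] for m, xm in enumerate(grid)],
                        step,
                    )
                    for j, xj in enumerate(grid)
                ]
                for xi in grid
            ]
    return probs
```

The cost grows as $n^3$ per hop, so a coarse grid is advisable for exploratory runs; a finer grid or a better quadrature rule can be substituted without changing the structure.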


6  Simulation  results 

In this section, we report on simulations that validate the accuracy of the analytical results. The simulations are conducted in a wireless multi-hop network simulator written in C++. Nodes are deployed in a 400 x 400 square following a homogeneous Poisson process with intensity $\rho = 0.003$. The boundary effect is included in the simulation, but it is shown to have a limited impact on the results. The route between two nodes is determined by the basic GF algorithm. The transmission range $r_0$ is varied from 10 to 50, which results in the average node degree varying from around 1 to 24. Note that $r_0$ is the transmission range without shadowing and small-scale fading. The value of $r_0$ can be specified by the network designer by adjusting the transmission power and receiver gain. The existence of a direct wireless link between an arbitrary pair of nodes is further affected by shadowing and small-scale fading. Several values of the standard deviation in the log-normal shadowing model have been used in our simulations, but only the results for $\sigma = 4$ are shown in this paper because the other results show a similar trend. Further, we only include the results for C2 = 0.01 and $Eng_c = 0.02$ (in Eq. 2) as an example; the value of $Eng_c$ is found to have very limited impact on the results. In order to distinguish the impact of different parameters on the network performance, the packet error rate is not included (i.e. set a = 0) except in Fig. 8, and small-scale fading is not included except in Fig. 6 and Fig. 7(b). Every point shown in the simulation results is the average value from 3000 simulations. As the number of instances of random networks used in the simulation is large, the confidence interval is too small to be distinguishable and hence is omitted from the following plots.
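The quoted range of average node degree follows directly from the deployment parameters. As a quick check, assuming the unit disk model and ignoring the boundary effect, the mean degree of a homogeneous Poisson process of intensity $\rho$ is $\rho \pi r_0^2$. The function names below are illustrative only.

```python
import math


def expected_node_count(density: float, side: float) -> float:
    """Mean number of nodes of a homogeneous Poisson process in a side x side square."""
    return density * side * side


def mean_degree_unit_disk(density: float, r0: float) -> float:
    """Mean number of neighbours within range r0, ignoring the boundary effect."""
    return density * math.pi * r0 ** 2


if __name__ == "__main__":
    rho, side = 0.003, 400.0
    print("expected nodes:", round(expected_node_count(rho, side)))
    for r0 in (10, 50):
        print("r0 =", r0, "mean degree =", round(mean_degree_unit_disk(rho, r0), 2))
```

With $\rho = 0.003$ this gives a mean degree of about 0.94 at $r_0 = 10$ and about 23.56 at $r_0 = 50$, consistent with the range of around 1 to 24 stated above, and about 480 nodes on average in the 400 x 400 square.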

6.1  Hop  count  statistics 

Fig. 5 shows the probability that two arbitrary nodes separated by distance $x_0$ are $k$ hops apart using GF in the unit disk model and the log-normal shadowing model respectively. In the log-normal shadowing model [18], the received signal strength (RSS) attenuation (in dB) follows a normal distribution with standard deviation $\sigma$ around the mean value. The mean value is given by the RSS under the path loss attenuation model, which is the model adopted in the unit disk model to determine the transmission range. Therefore when $\sigma = 0$, the log-normal shadowing model reduces to the unit disk model. As shown in Fig. 5, Dep2-unit and Dep2-log completely agree and the analytical results match the simulation results well, which verifies the accuracy of our analysis in both the unit disk model and the log-normal shadowing model.
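To make the reduction to the unit disk model concrete, the sketch below computes the probability of a direct link at distance $d$ under log-normal shadowing. It assumes the common formulation in which a link exists when the shadowed RSS exceeds the threshold that defines $r_0$, so that the link probability is $Q(10\,\eta \log_{10}(d/r_0)/\sigma)$ with $\eta$ the path loss exponent. The function name and signature are hypothetical, and the expression used elsewhere in the paper may be parameterized differently.

```python
import math


def link_probability(d: float, r0: float, eta: float, sigma: float) -> float:
    """Probability of a direct link at distance d under log-normal shadowing.

    r0    -- transmission range without shadowing
    eta   -- path loss exponent
    sigma -- standard deviation of the shadowing in dB (0 gives the unit disk model)
    """
    if d <= 0.0:
        return 1.0
    if sigma == 0.0:
        # No shadowing: deterministic unit disk rule.
        return 1.0 if d <= r0 else 0.0
    # Extra mean path loss (in dB) at distance d relative to distance r0.
    excess_loss_db = 10.0 * eta * math.log10(d / r0)
    # Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2)).
    return 0.5 * math.erfc(excess_loss_db / (sigma * math.sqrt(2.0)))
```

Under this assumption a node at exactly distance $r_0$ is connected with probability 0.5, nodes beyond $r_0$ retain a non-zero chance of a link, and nodes inside $r_0$ can lose theirs, which is the mechanism behind the longer per-hop progress discussed in this section.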

In addition, we can see that the accuracy is significantly improved by considering two previous hops (the result from this paper) compared with previous analysis considering only one previous hop (e.g. [23]). Further, it can be seen in Fig. 5 that the improvement in accuracy will be marginal if more than two previous hops are considered, which also confirms the analysis in Section 4. We expect this observation to extend to many other areas (e.g. routing, localization, network security), and the approach used for shedding the independence assumption can be seen in a broader context. Specifically, our approach for shedding the independence assumption is to show that one can improve the accuracy of $\Pr(k \mid x)$ by taking into account the locations of the nodes of the previous $m$ hops ($1 \le m \le k - 1$). However, the improvement becomes marginal for $m > 2$. This suggests that the locations of nodes three or more hops away provide little information in determining the geometric relationship of a node with other nodes in the network. This observation further confirms our assertion in Section 4.

Furthermore, it is interesting to see that packets can be transmitted over a larger distance under the log-normal shadowing model than under the unit disk model for the same number of hops. This is because log-normal shadowing introduces a Gaussian variation of the transmission range around the mean value, so that a node has a higher chance of finding a next-hop neighbor closer to the destination. This phenomenon is also observed in the study of connectivity [19].

Fig. 6 shows the probability that two arbitrary nodes separated by distance $x_0$ are $k$ hops apart in the Log-normal-Nakagami model when the Nakagami parameter $m = 1$, in which case the network is subject to log-normal shadowing and Rayleigh fading. The result shown in Fig. 6 verifies the accuracy of our analysis. Further, it can be seen that Rayleigh fading reduces the probability that two nodes are connected by a path with $k$ hops. This can be explained by the fact that Rayleigh fading makes the RSS exponentially distributed about its mean value, which reduces the probability of a direct connection. Therefore, Rayleigh fading has a negative impact on the network connectivity. A similar result is also observed in the next subsection.

6.2  Effective  energy  consumption 

Fig. 7 shows the probability of successful transmissions and $Eng_{eff}$. It can be seen that, unsurprisingly, the probability of successful transmissions increases from nearly 0 to nearly 1 as $r_0$ increases from 10 to 50. In contrast, the behaviour of the effective energy consumption could hardly have been predicted by heuristic reasoning, and needs more explanation.

Fig. 5: The probability that two arbitrary nodes separated by Euclidean distance $x_0$ are $k$ hops apart. Dep1 stands for the result calculated by considering the dependency on the previous one hop. Dep2 is the result from this paper.
(a) In the unit disk model. Dep2-unit is the result in the unit disk model from this paper, while Dep2-log is the result in the log-normal shadowing model obtained by letting $\sigma = 0$. Dep2-log is indistinguishable in the plot because the curve fully agrees with Dep2-unit.
(b) In the log-normal shadowing model.

Fig. 6: The probability that two arbitrary nodes separated by distance $x_0$ are $k$ hops apart in a network subject to log-normal shadowing and Rayleigh fading.

Fig. 7: Probability of successful transmissions (Succ) and effective energy consumption (Eng) in the network. Subfigure (a) shows the results without small-scale fading and (b) shows the results with Rayleigh fading. Further, "No Rayleigh" is the result without small-scale fading, shown in (b) for comparison.
(a) Without small-scale fading.
(b) With small-scale fading.

Take the results under the unit disk model as an example. When $r_0$ is small, the network is made up of a large number of small components. An increase in $r_0$ causes an increase in the size (number of nodes) of the components and a reduction in the number of components. Therefore the average number of hops for unsuccessful transmissions increases, and the energy wasted on unsuccessful transmissions also increases. Thus there is an initial increase in $Eng_{eff}$ with the increase in $r_0$. As $r_0$ increases further, although the average number of hops for successful/unsuccessful transmissions still increases, the energy wasted on unsuccessful transmissions starts to decrease as more source-destination pairs become connected. The balance of the two effects causes $Eng_{eff}$ to peak at $r_0 \approx 19$. Above this transmission range, the decrease in wasted energy starts to dominate, which causes a subsequent decrease in $Eng_{eff}$. As $r_0$ increases further still, the average number of hops approaches its maximum and the energy wasted on unsuccessful transmissions reduces to a small amount. These effects cause $Eng_{eff}$ to reach its minimum at $r_0 \approx 31$. Above this transmission range, most source-destination pairs are connected, as shown in Fig. 7(a), and another effect starts to dominate: the increase in $r_0$ causes an increase in the per-hop energy consumption (like $r_0^{\eta}$) and a decrease in the number of hops (approximately like $1/r_0$). The net effect is an increase in $Eng_{eff}$ with increasing $r_0$. Most previous studies have only considered this last stage of the relation between the energy consumption and the transmission range and therefore cannot give a complete understanding of the energy efficiency in end-to-end packet transmissions.
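The last stage of this relation can be made concrete with a back-of-the-envelope scaling. This is only a sketch under assumptions that are not taken from Eq. 2: every hop costs energy proportional to $r_0^{\eta}$ with a hypothetical constant $c$, any distance-independent per-hop cost is ignored, and a connected path over distance $x_0$ takes roughly $x_0 / r_0$ hops.

```latex
\[
Eng \;\approx\; \underbrace{\frac{x_0}{r_0}}_{\text{hops}} \cdot
\underbrace{c\, r_0^{\eta}}_{\text{energy per hop}}
\;=\; c\, x_0\, r_0^{\eta - 1},
\qquad
\frac{\partial Eng}{\partial r_0} = c\,(\eta - 1)\, x_0\, r_0^{\eta - 2} > 0
\quad \text{for } \eta > 1 .
\]
```

Under these assumptions the energy grows monotonically with $r_0$ once connectivity is no longer the limiting factor, which is the only regime that a model assuming an always-connected network can capture.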

It is interesting to note that the energy-optimizing transmission range is around 31, which corresponds to a network with most (around 70%) but not all source-destination pairs connected. (Note that for $r_0 < 20$, $Eng_{eff}$ may be smaller than this minimum value. However, at such values of $r_0$ most source-destination pairs are disconnected using GF and no meaningful service can be provided by the wireless multi-hop network.) In order for more than 99% of source-destination pairs to be connected, $r_0$ has to be larger than 47, and $Eng_{eff}$ increases to more than 225% of its minimum value in the unit disk model. A similar result can also be found in the log-normal shadowing model and the models with Rayleigh fading. Therefore significant energy savings can be obtained by requiring most nodes, instead of all nodes, in the network to be connected. This observation also agrees with the analytical results in [12]. In addition, our result gives the amount of energy that can be saved. Purely from an energy-saving perspective and without consideration of other implications, this result shows that the most energy-efficient topology control algorithms should be designed to let 70% (under this network setting) of the source-destination pairs be connected at the same time. The result offers insight into the design of large wireless multi-hop networks where energy efficiency is an important issue.

Further, Fig. 7(b) shows that the probability of successful transmissions is slightly lower in a network with Rayleigh fading than in a network without Rayleigh fading. This confirms our assertion in the previous subsection.


Fig. 8: The effective energy consumption subject to packet error. (Plot title: Poisson, unit disk, area = $400^2$, $r_0 = 10$ to $50$, $\rho = 0.003$, $\eta = 4$.)

Fig. 8 shows the effective energy consumption with a non-zero packet error rate as given in Eq. 2. The packet error rate increases from 0.004 when $r_0 = 10$ to 0.40 when $r_0 = 50$. It can be seen that as the transmission range increases, the tail of the effective energy consumption increases faster than its error-free counterpart. This is because an increase in the transmission range causes an increase in the number of neighbors and also an increase in the distance between the transmitter and the receiver. This in turn increases the packet error rate and the energy consumption. Therefore when the packet error rate is non-zero, the energy-optimizing transmission range becomes smaller, as can be seen in Fig. 8.
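One simple way to see why packet errors inflate the energy cost is the following. If each hop is retransmitted until it succeeds and errors are independent with probability $p$, the number of transmissions per hop is geometric with mean $1/(1-p)$. This retransmission model is an assumption made here for illustration; the energy model of Eq. 2 may account for packet errors differently.

```python
def expected_transmissions(packet_error_rate: float) -> float:
    """Mean number of transmissions per hop when a hop is retried until it succeeds.

    Assumes independent errors with the given probability (hypothetical model).
    """
    if not 0.0 <= packet_error_rate < 1.0:
        raise ValueError("packet error rate must lie in [0, 1)")
    return 1.0 / (1.0 - packet_error_rate)


if __name__ == "__main__":
    for p in (0.004, 0.40):
        print(p, round(expected_transmissions(p), 3))
```

For the two error rates quoted above, this gives roughly 1.004 transmissions per hop at $r_0 = 10$ and roughly 1.67 at $r_0 = 50$, so under this assumption the penalty is concentrated at large transmission ranges.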

Fig. 9 shows the effective energy consumption under the log-normal shadowing model with various values of the standard deviation. It can be seen that a larger variance in the log-normal shadowing model leads to a lower energy consumption and a smaller optimum transmission range. This is because a larger variance gives a node a larger probability of forwarding the packet to a farther node that is closer to the intended destination, which is similar to the observation obtained in Section 6.1.

Fig. 9: The effective energy consumption under the log-normal shadowing model with various values of the standard deviation. (Plot title: Poisson, log-normal shadowing, area = $400^2$, $r_0 = 10$ to $50$, $\rho = 0.003$, $\eta = 4$, $\sigma = 4, 8, 12$.)

6.3 Impact of node density and path loss exponent on the optimum transmission range

Fig. 10: Impact of node density and path loss exponent on the optimum transmission range. (Plot title: Poisson, unit disk, area = $400^2$, $\eta = 2$ to $6$, $\rho = 0.002$ to $0.004$.)

Fig. 10 illustrates the impact of node density and path loss exponent on the optimum transmission range in the unit disk model. It can be seen that an increase in the node density causes a decrease in the optimum transmission range. This is because an increase in node density without a reduction in transmission range causes an increase in the average number of neighbors as well as in the probability of successful transmissions between two nodes. Fig. 10 also shows that a higher path loss exponent results in a smaller optimum transmission range. This is because an increase in the path loss exponent causes an increase in the per-hop energy consumption, as given by Eq. 2. Therefore under a higher value of the path loss exponent it is more energy-efficient to have smaller components, hence a smaller optimum transmission range.

It has been shown that the probability $\Pr(k \mid x)$ and the energy consumption are affected by the node density and path loss exponent. Our analysis captures these effects and offers insight into the design of a wireless multi-hop network.

7  CONCLUSIONS  AND  FUTURE  WORK 

We investigated the hop count statistics and the energy consumed in end-to-end packet transmissions in a wireless multi-hop network. Considering both shadowing and small-scale fading, we obtained analytical results on the probability distribution of the number of hops between two arbitrary nodes. Further, we analyzed the impact of the spatial dependence problem on $\Pr(k \mid x)$. Considering the randomness of node deployment and a complex radio environment which may result in disconnected paths between nodes, we derived the distribution of the number of hops traversed by packets before being dropped if the transmission is unsuccessful. As an application of the above results, we derived the effective energy consumption per successfully transmitted packet in end-to-end packet transmission. We showed that there exists an optimum transmission range which minimizes the effective energy consumption. The research provides useful guidelines for the design of a multi-hop network in the presence of shadowing and fading.

The hop count statistics obtained in this paper will also be useful in determining other aspects of wireless multi-hop network performance, e.g. end-to-end throughput and delay. Allowing disconnected paths makes our results applicable to sparse networks, which is essential to the study of partial connectivity [38]. Moreover, we plan to study the hop count statistics and energy consumption in a mobile ad-hoc network in the future.

8 Related work

The hop count statistics were first investigated by Chandler [13] in 1989. He analyzed the probability that two randomly chosen nodes separated by a known distance can communicate in $k$ or fewer hops where nodes are uniformly distributed over a plane. However, the analysis was incomplete as the aforementioned spatial dependence problem was incorrectly ignored. Ta et al. [3] investigated the probability $\Pr(k \mid x)$ for nodes distributed in a square following a Poisson process. They pointed out the spatial dependence problem in the analysis of the probability $\Pr(k \mid x)$. Later in [4] the same authors empirically improved their earlier result in [3] by considering the impact of the boundary effect and the spatial dependence problem.

In real applications, packets are forwarded from a source to a destination according to certain routing algorithms. Many routing algorithms (e.g. LEACH, AODV or geographic routing [14]) share the idea of greedy forwarding (GF) [14], that is, to forward the message to the node that is closest to the destination [15]. Much research on the hop count statistics is based on the distributed routing algorithm GF; however, the spatial dependence problem was incorrectly ignored. Specifically, two nodes are $k$ hops apart if the path between them, using GF, is $k$ hops. Zorzi et al. [7] proposed a GF algorithm for a network where nodes are Poisson distributed in the coverage area of a transmitting node. They studied an upper and a lower bound on the average number of hops between two nodes separated by a known Euclidean distance, whereas the focus of this paper is on a complete characterization of the probability distribution of the number of hops between two arbitrary nodes in the network. In [16], Contla and Stojmenovic considered position-based routing schemes for a wireless multi-hop network where nodes are uniformly distributed in a square. They studied the average number of hops between an arbitrary pair of source-destination nodes. As pointed out in Section 1, many applications need knowledge of the probability distribution of the number of hops, instead of just the mean value. Dulman et al. [6] investigated the probability $\Pr(k \mid x)$ by estimating the expected progress per hop using GF. They considered the impact of the Euclidean distance between neighboring nodes in the previous hop on the progress in the current hop. Both [7] and [6] were established on the assumption that a packet can always reach the destination using GF. Further, the aforementioned research only considered the impact of one previous hop. Recently, the accuracy of the probability $\Pr(k \mid x)$ was significantly improved in [17] by considering the spatial dependence of two-hop neighbors.
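For concreteness, the following is a minimal sketch of basic greedy forwarding under the unit disk model: each node hands the packet to the neighbour closest to the destination, provided that neighbour is strictly closer than the node itself, and the packet is dropped otherwise. The function name, the node representation and the handling of ties are assumptions of this sketch rather than details taken from [14].

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def greedy_forward(nodes: List[Point], src: int, dst: int, r0: float) -> Optional[List[int]]:
    """Return the GF path from src to dst as a list of node indices, or None if dropped.

    Two nodes are neighbours when their Euclidean distance is at most r0 (unit disk).
    """
    path = [src]
    current = src
    while current != dst:
        best = None
        best_dist = math.dist(nodes[current], nodes[dst])  # must make strict progress
        for i, point in enumerate(nodes):
            if i == current or math.dist(nodes[current], point) > r0:
                continue
            dist_to_dst = math.dist(point, nodes[dst])
            if dist_to_dst < best_dist:
                best, best_dist = i, dist_to_dst
        if best is None:
            return None  # no neighbour is closer to the destination: packet dropped
        path.append(best)
        current = best
    return path


if __name__ == "__main__":
    line = [(0.0, 0.0), (8.0, 0.0), (16.0, 0.0), (24.0, 0.0)]
    print(greedy_forward(line, 0, 3, r0=10.0))
    print(greedy_forward(line, 0, 3, r0=5.0))
```

Because the distance to the destination strictly decreases at every hop, the loop always terminates. The hop count of a successful transmission is `len(path) - 1`, and a `None` result corresponds to an unsuccessful transmission in the sense used in this paper.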

The aforementioned results are all based on the unit disk communication model, in which two nodes are directly connected if and only if (iff) the Euclidean distance between them is smaller than or equal to the transmission range. The unit disk model is simple but unrealistic [18]. Considering the log-normal shadowing model, Hekmat and Mieghem [19] showed, through simulations, that the probability of a network being connected increases with increasing value of the shadowing parameter, which is the ratio between the standard deviation of shadowing and the path loss exponent [18]. Mukherjee and Avidor [20] considered the impact of log-normal shadowing on the probability $\Pr(k \mid x)$ in a wireless ad hoc network where nodes are Poisson distributed in a disk, but ignored the spatial dependence problem. In addition to shadowing, the communication between two nodes can be affected by small-scale fading (e.g. Rayleigh fading). From the connectivity point of view, Miorandi and Altman [21] studied the node isolation probability in a network subject to both log-normal shadowing and Rayleigh fading. They showed that Rayleigh fading reduces the connectivity probability of the network. Moreover, Haenggi [22] studied the routing performance for large multi-hop networks, considering the impact of Rayleigh fading on the end-to-end delivery probability. It is shown that routing over many short hops is not as beneficial in a network subject to Rayleigh fading as in a network without Rayleigh fading. In this paper, we considered the impact of both log-normal shadowing and small-scale fading on the hop count statistics. Further, our analysis takes into account the impact of the spatial dependence problem, which is a major technical hurdle in the accurate analysis of the probability $\Pr(k \mid x)$.

The analysis of the hop count statistics can be used in a number of areas in wireless multi-hop networks. This paper focuses on its use in energy-efficient operations of wireless multi-hop networks as an example. Minimizing energy consumption is one of the major considerations in the design of battery-powered wireless multi-hop networks. In many applications it is difficult to change or recharge the battery of a wireless node. From a designer's point of view, a popular approach to reducing energy consumption is to choose the transmission power optimally. In [23], Deng et al. considered a network where nodes are Poisson distributed in a circular area. They assumed that there is always a path between any pair of nodes using GF. By analyzing the average progress per hop that a packet makes towards the destination, they obtained analytical results on the distance-energy efficiency, which is the ratio of the average progress to the energy consumed in a single transmission, and on the optimum transmission range that maximizes the distance-energy efficiency for high-density networks. Zhang and Gorce [24] considered the impact of unreliable links on energy consumption. They postulated that a transmission between two directly connected nodes is unsuccessful with a certain probability, so that re-transmissions may be required and energy consumption may consequently be higher. The extra energy consumed due to unreliable links is also considered in this paper.
9 Usefulness of the hop count statistics

The results on the hop count statistics hold the key to solving a large number of problems in wireless multi-hop networks:

• The summation of $\Pr(k)$ from $k = 1$ to $k = \infty$ provides the probability that two randomly chosen nodes are connected (via a multi-hop path), which in turn can lead to results on the probability of a connected network. The result will be valid not only for large-scale networks [39] but also for small-scale networks, which are often encountered in real applications. So far, there are few analytical results on the connectivity of small-scale networks.

• The hop count statistics are also useful in network capacity analysis. In [40], it is demonstrated that the capacity scaling law of multi-hop networks observed in [11] can be easily explained by the increase in the average number of hops (hence the increase in the portion of bandwidth spent on relaying traffic) as the network becomes larger.

• The probability $\Pr(k)$ is also useful in estimating the energy consumption, the network lifetime and the reliability of end-to-end packet transmission [7]. As shown in this paper, $\Pr(k)$ can be used to estimate the effective energy consumption and help network designers choose the optimum transmission range/power to minimize the energy consumption. It can also be readily used to help a network designer set the transmission range/power to provide a guaranteed performance for end-to-end packet transmissions.

• The probability $\Pr(k \mid x)$ has been used in [41] to form a novel approach to obtaining bounds on the critical density for percolation in wireless multi-hop networks, a well-known open problem in the area. In [38], results on $\Pr(k \mid x)$ are used as a main tool to study the partial connectivity of a wireless multi-hop network with infrastructure support. Besides its use in performance analysis, a protocol designer can use results on $\Pr(k \mid x)$ to help choose the optimum protocol parameters, e.g. the timeout parameter TTL used in many routing protocols, to balance bandwidth (or energy) consumption and probability of successful delivery [5].

• The probability density $\Pr(x \mid k)$ is useful in estimating the distance between two nodes from their neighborhood information and in obtaining the variance of such an estimate, which has in turn been used in forming a localization algorithm with improved performance [8], [32].

• Further, the technique used to derive $\Pr(k \mid x)$ can be simplified to study 1D networks or grid networks so that the analysis can be applied to vehicular networks; see [42] for an example where $\Pr(k \mid x)$ is used to derive the access and connectivity probabilities of 1D vehicular networks.
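Two of the uses listed above reduce to simple operations on the hop count distribution. The sketch below sums $\Pr(k \mid x)$ over $k$ to obtain the probability that a source-destination pair is connected, and picks the smallest TTL that delivers a target fraction of the deliverable packets. The truncation at a finite maximum hop count, the function names and the example numbers are assumptions made for illustration.

```python
from typing import List, Optional


def connection_probability(hop_probs: List[float]) -> float:
    """Probability that the pair is connected: sum of Pr(k|x) for k = 1, 2, ...

    hop_probs[k - 1] holds Pr(k|x); the list is truncated at some finite k.
    """
    total = 0.0
    for p in hop_probs:
        total += p
    return total


def smallest_ttl(hop_probs: List[float], target_fraction: float) -> Optional[int]:
    """Smallest hop limit that delivers at least target_fraction of deliverable packets."""
    total = connection_probability(hop_probs)
    if total <= 0.0:
        return None  # the pair is never connected
    cumulative = 0.0
    for k, p in enumerate(hop_probs, start=1):
        cumulative += p
        if cumulative >= target_fraction * total:
            return k
    return None


if __name__ == "__main__":
    example = [0.125, 0.25, 0.25, 0.125]  # hypothetical Pr(k|x) for k = 1..4
    print(connection_probability(example))
    print(smallest_ttl(example, 0.5))
```

A smaller TTL saves the bandwidth and energy spent on packets that travel many hops, at the price of dropping the fraction of packets whose GF path is longer than the limit.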

10 Reliability of the assumption of the independence between links

In some environments, the assumption of independence of connections may not be accurate, while in other environments (e.g. open space) it is a reasonable assumption. For example, it is generally accepted that if a pair of transmitters are separated by more than $\lambda/4$, where $\lambda$ is the wavelength, their signals at a common receiver can be regarded as statistically independent. Further, it was shown in [43] that if a pair of receivers are separated by more than $\lambda$, their received signals from a common transmitter are only weakly correlated (with a correlation coefficient less than 0.15). At a typical frequency of 5 GHz, $\lambda = 0.06$ m. Thus the requirement on the separation of vehicles can be easily met. We also note that although field measurements in real applications seem to indicate that the connectivity between different pairs of geographically/frequency proximate wireless nodes is correlated [44], [45], the independence assumption is generally considered appropriate for far-field transmission and has been widely used in the literature under many channel models, including the log-normal shadowing model [18], [20], [25].

A Robust Reachability Review for Control System Security


Adrian  N.  Bishop 


Abstract 

Control  systems  underpin  the  core  technology  in  numerous  critical  infrastructure  systems;  e.g.  the  electricity,  transportation 
and  many  defence  systems.  Increasingly,  such  systems  are  becoming  the  target  of  a  novel  type  of  deliberate  cyber-attack.  The 
notion  of  control  system  security  is  a  modern  idea  concerned  with  the  analysis,  design  and  application  of  tools  that  ensure  the 
operational goals of a particular control system are protected from deliberate, malicious, electronic attack. The aim of this field is
to  reduce  the  likelihood  of  success,  and  the  severity  of  impact,  of  a  cyber-attack  against  control  systems  operating  within  critical 
infrastructure.  This  contribution  re-examines  a  classical  notion  of  control  reachability  through  set-theoretic  arguments  with  an 
additional,  modern,  emphasis  on  control  system  security.  In  particular,  the  reachability  idea  is  extended  to  a  compromised  control 
system  architecture.  Classical  and  novel  results  on  reachability  are  defined  within  this  setting  from  the  point-of-view  of  an  attacker 
and,  conversely,  a  system  designer  wanting  to  secure  the  system.  The  idea  of  reachability  studied  in  this  setting  for  secure  control 
is  important  in  both  the  design  of  robust  control  systems  with  security  features  and  in  assessing  the  vulnerability  of  particular 
control  systems. 


I.  Introduction 

The problem of networked estimation and control has been considered extensively in recent years; see the expositions in [1]-[3]. This work considers security for networked estimation and control systems, or so-called networked cyber-physical systems. There are many networked control systems that operate critical infrastructure within strategic resource sectors$^1$. The reliability and security of most systems of this nature are critical. However, the distributed and networked nature of the system often makes it vulnerable to a deliberate attack; e.g. many distributed cyber-physical systems are controlled via unsecured communication networks that may also be accessible through the world-wide-web [4]. Moreover, there are many sensitive applications involving networked control systems, e.g. power and transportation networks, and such applications are appealing targets for many malicious entities. A disruption or disconnection of these services may have significant social, environmental and/or economic consequences. Secure networked estimation and control is thus an important emerging problem$^2$.

The  idea  of  secure  control  and  estimation  systems  is  becoming  increasingly  important,  e.g.  as  a  result  of  significant  publicised  attacks  and 
the  realization  that  a  compromised  system  operating  in  a  key  sector  may  have  significant  social,  environmental  and/or  economic  consequences. 
Early  studies  in  [4],  [8]  outline  a  framework  for  discussing  and  investigating  the  security  of  networked  control  and  estimation  systems.  In 
[4] two classes of attack are characterized: denial-of-service (DOS) attacks, where the attacker prevents the transmission of, e.g.,
measurement or control signals, and deceptive attacks, where a false (or distorted) signal is inserted into the system to disrupt the behaviour
of the estimator or controller.

The  problem  of  robust  control  in  the  presence  of  DOS  attacks  has  been  investigated  in  [8].  This  work  has  similarities  with  those  networked 
control  problems  that  are  designed  to  cope  with  limited-data  rate  communications  and  packet-dropping.  However,  the  latter  work  often  assumes 
some  random  model  of  packet  failure  whereas  any  DOS  attack  may  be  far  more  systematic.  The  problem  of  secure  control  and  estimation 
under  false-data  attacks  has  been  investigated.  In  [9],  [10]  the  idea  of  reachability  is  used  to  analyze  the  security  of  control  systems  and 
their  resilience  to  attack.  In  [11]  a  necessary  and  sufficient  condition  is  provided  under  which  an  attacker  can  destabilize  a  system  nominally 
controlled  by  an  optimal  control  law  while  remaining  undetected  by  a  particular  detector. 

Alternatively, the problem of secure estimation and control in networked systems under false-data attacks has attracted significant attention
recently; e.g. see [11]–[27]. At a high level, the majority of this work addresses the problem of state estimation with inherent detection of
false-data attacks or, conversely, a characterization of those attacks which are likely (under some modelling and algorithmic assumptions) to
remain undetected in certain state estimation schemes. The classical field of fault detection and fault-tolerant control, see [28], is also related to
this work.

This contribution examines a classical notion of control reachability through set-theoretic arguments with a modern emphasis on control
system  security.  In  particular,  the  reachability  idea  is  extended  to  a  compromised  control  system  architecture.  Classical  and  novel  results 
on  reachability  are  defined  within  this  setting  from  the  point-of-view  of  an  attacker  and,  conversely,  a  system  designer  wanting  to  secure 
the  system.  The  idea  of  reachability  studied  in  this  setting  for  secure  control  is  important  in  both  the  design  of  robust  control  systems  with 
security  features  and  in  assessing  the  vulnerability  of  particular  control  systems. 

II.  A  General  Introduction  on  Robust  Reachability 

The introduction in this section follows the classical results in [29]–[31]. However, the notation and set-theoretic reachability results are
stated, and derived, in a different fashion that suits the later framework of control security.

Consider  the  time-varying  discrete-time  system 

$$x_{k+1} = f_k(x_k, u_k) + g_k(w_k) \qquad (1)$$

A.N.  Bishop  is  with  NICTA  and  the  Australian  National  University. 

1 Examples of so-called networked cyber-physical systems include critical infrastructure such as transportation networks, power and electricity grids, water
distribution  networks,  etc.  In  these  cases,  the  networked  components  may  be  spatially  separated. 

2 An  example  of  an  attack  on  a  real-world  networked  control  system  is  the  highly  sophisticated  Stuxnet  attack  on  the  Iranian  nuclear  energy  program  [5]. 
Stuxnet  is  a  worm  that  spreads  indiscriminately  across  computer  networks  running  Microsoft  Windows  causing  no  disruption  but  which  also  includes  a  payload 
designed  to  target  Siemens  supervisory  control  and  data  acquisition  (SCADA)  systems  configured  to  control  and  monitor  very  specific  industrial  processes 
(most  likely  certain  centrifuges).  In  particular,  Stuxnet  infects  the  programmable  logic  controllers  (PLC)  in  a  SCADA  system  by  subverting  the  application 
used  to  reprogram  these  devices.  It  then  disrupts  the  operation  of  certain,  very  specific,  industrial  processes  being  driven  by  the  controller.  Stuxnet  is  so 
sophisticated that the general consensus is that it was state-sponsored and the first such act of industrial cyber-warfare [5]. Other publicised, but less dramatic,
examples  of  intrusions  and  attacks  on  networked  control  systems  exist;  e.g.  [6],  [7]. 
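defined for $k \in \mathbb{N} \cup \{0\} = \mathbb{N}_0$ and where $x_k \in \mathcal{X}_k \subset \mathcal{X} \subset \mathbb{R}^n$ is the state, $u_k \in \mathcal{U}_k \subset \mathcal{U} \subset \mathbb{R}^m$ is the control vector and
$w_k \in \mathcal{W}_k \subset \mathcal{W} \subset \mathbb{R}^l$ is the system uncertainty input or error. The functions $f_k : \mathbb{R}^n \times \mathcal{U} \to \mathbb{R}^n$ and $g_k : \mathcal{W}_k \to \mathbb{R}^n$ are known and
$f_k + g_k : \mathcal{W}_k \times \{\mathbb{R}^n \times \mathcal{U}\} \to \mathcal{X}_{k+1}$. The system (1) describes the physical system under consideration.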

The  following  definition  makes  matters  precise. 

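Definition 1. Suppose $\mathcal{X}_0$ is given and fixed. Suppose a sequence of control functions is specified of the form $u_k = u_k(x_k) : \mathcal{X}_k \to \mathcal{U}_k$.
Then

$$\mathcal{X}_{k+1} = \{x_{k+1} \in \mathbb{R}^n : x_{k+1} = f_k(x_k, u_k) + g_k(w_k),\ \forall x_k \in \mathcal{X}_k,\ \forall u_t \in \mathcal{U}_t,\ \forall w_t \in \mathcal{W}_t,\ \forall t \in \{0, \dots, k\}\}$$

is called the local set of admissible state values under the sequence $\{\mathcal{U}_t\}_{t=0}^{k}$ at time $k$ for all $k \geq 0$.

If we suppose that $u_k$ may take any value in $\mathcal{U}$ at time $k$ and suppose $\mathcal{X}_0$ is given and fixed then

$$\mathbb{X}_{k+1} = \{x_{k+1} \in \mathbb{R}^n : x_{k+1} = f_k(x_k, u_k) + g_k(w_k),\ \forall x_k \in \mathbb{X}_k,\ \forall u_t \in \mathcal{U},\ \forall w_t \in \mathcal{W}_t,\ \forall t \in \{0, \dots, k\}\}$$

is called the global set of admissible state values at time $k$ for all $k \geq 0$.

It is clear that $\mathcal{X}_k \subset \mathbb{X}_k$. Note $\mathcal{X} = \cup_k \mathcal{X}_k \subset \mathbb{R}^n$ is defined by a particular feedback controller whereas $\mathcal{U}$ and $\mathcal{W}$ are defined by the
physical properties of the system as expected. Obviously $\mathcal{X} \subset \cup_k \mathbb{X}_k$.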
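
The distinction between the local and global sets of admissible state values can be illustrated with a minimal sketch. It assumes a scalar linear instance of (1), x_{k+1} = a x_k + b u_k + w_k, with every set replaced by a closed interval; the `Interval` class, the function names, the feedback gain and the numerical values are all hypothetical choices made for illustration and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] standing in for a set in R^1."""
    lo: float
    hi: float

    def __add__(self, other):
        # Minkowski sum of two intervals.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, c):
        # Image of the interval under multiplication by the scalar c.
        a, b = c * self.lo, c * self.hi
        return Interval(min(a, b), max(a, b))

    def contains(self, other):
        return self.lo <= other.lo and other.hi <= self.hi


def global_sets(x0, U, W, a, b, steps):
    """Admissible sets when u_k may take any value in U (global sets)."""
    sets = [x0]
    for _ in range(steps):
        sets.append(sets[-1].scale(a) + U.scale(b) + W)
    return sets


def local_sets(x0, W, a, b, gain, steps):
    """Admissible sets under the fixed feedback u_k = -gain * x_k (local sets)."""
    sets = [x0]
    for _ in range(steps):
        # Closed loop: x_{k+1} = (a - b * gain) * x_k + w_k.
        sets.append(sets[-1].scale(a - b * gain) + W)
    return sets


# Hypothetical numbers; the feedback values stay inside U for these choices.
X0 = Interval(-1.0, 1.0)
U = Interval(-1.0, 1.0)
W = Interval(-0.1, 0.1)
glob = global_sets(X0, U, W, a=1.0, b=1.0, steps=3)
loc = local_sets(X0, W, a=1.0, b=1.0, gain=0.5, steps=3)
nested = all(g.contains(l) for g, l in zip(glob, loc))
```

For these values the local sets shrink while the global sets grow, and each local set is contained in the corresponding global set, as the remark following Definition 1 states.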

The  general  control  problem  under  consideration  is  stated  as  follows. 

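Problem 1. Consider a time $0 < \tau \in \mathbb{N}$ and suppose $\mathcal{X}_0$ is given and fixed. Design a control function $u_k = u_k(x_k) : \mathcal{X}_k \to \mathcal{U}_k$ mapping
$\mathcal{X}_k$ into $\mathcal{U}_k$, for all $k \in \{0, \dots, \tau - 1\}$, such that the following holds

$$f_{\tau-1} + g_{\tau-1} : \mathcal{W}_{\tau-1} \times \{\mathcal{X}_{\tau-1} \times \mathcal{U}_{\tau-1}\} \to \mathcal{X}_\tau^* \qquad (2)$$

for all $w_k \in \mathcal{W}_k$, $\forall k \in \{0, \dots, \tau - 1\}$. The set $\mathcal{X}_\tau^* \subset \mathbb{R}^n$, $0 < \tau \in \mathbb{N}$, constitutes the desired, or target, set at time $\tau$ for the controlled
system (1) and should be known to the control designer a priori.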

A question in robust control then concerns the solvability of Problem 1. We introduce the following definition.

Definition 2. A set $\mathcal{X}_\tau^* \subset \mathbb{R}^n$, $0 < \tau \in \mathbb{N}$, is said to be reachable given $\mathcal{X}_0$ if there exists at least one sequence of control functions
$u_k : \mathcal{X}_k \to \mathcal{U}_k$, $\forall k \in \{0, \dots, \tau - 1\}$, that solves Problem 1.

Obviously $\mathcal{X}_\tau^* \subset \mathbb{X}_\tau$, where $\mathbb{X}_\tau$ is the global set of admissible state values at time $\tau$, is a necessary condition for reachability; however, it is not
necessarily true that every subset of $\mathbb{X}_\tau$ is a reachable subset given some $\mathcal{X}_0$. Admissible is not equivalent to reachable.

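Consider a set

$$\mathcal{T}_k = \{x \in \mathcal{Q}_k : f_k(x, u_k) \in \mathcal{E}_{k+1} \text{ for some } u_k \in \mathcal{U}_k\} \qquad (3)$$

where

$$\mathcal{E}_{k+1} = \{x \in \mathbb{R}^n : (x + g_k(w_k)) \in \mathcal{Q}_{k+1},\ \forall w_k \in \mathcal{W}_k\} \qquad (4)$$

with boundary conditions $\mathcal{Q}_\tau = \mathcal{X}_\tau^*$ and $\mathcal{Q}_0 = \mathcal{X}_0$. Suppose $\mathcal{T}_0 \neq \emptyset$. Then $x_0 \in \mathcal{T}_0 \subset \mathcal{X}_0$ implies $x_1 \in \mathcal{Q}_1$. Similarly, if $\mathcal{T}_1 \neq \emptyset$ then
$x_1 \in \mathcal{T}_1 \subset \mathcal{Q}_1$ implies $x_2 \in \mathcal{Q}_2$ etc.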

Proposition 1. The target set $\mathcal{X}_\tau^*$, $0 < \tau \in \mathbb{N}$, is reachable from all $x_0 \in \mathcal{X}_0$ if and only if $\mathcal{Q}_k \subset \mathcal{T}_k$ for all $k \in \{0, \dots, \tau - 1\}$.

The sets $\mathcal{T}_k$ and $\mathcal{E}_k$ are, in principle, computable and can, in special cases, lead to the closed-form design of control functions $u_k$ that
ensure the target set is indeed reached.
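
One special case in which these sets can be computed by hand is a scalar linear system with interval sets. The sketch below is a hedged illustration only: it assumes x_{k+1} = a x_k + b u_k + w_k with a nonzero, represents sets as `(lo, hi)` tuples, and takes each intermediate set Q_k to be the full predecessor set of Q_{k+1}, which is a common choice in backward reachability but is an assumption here since the source leaves the intermediate sets unspecified. All function names are hypothetical.

```python
def erode(Q, W):
    """E = {x : x + w in Q for all w in W}, cf. (4); None if the set is empty."""
    lo, hi = Q[0] - W[0], Q[1] - W[1]
    return (lo, hi) if lo <= hi else None


def predecessor(E, U, a, b):
    """T = {x : a*x + b*u in E for some u in U}, cf. (3); assumes a != 0."""
    bu = sorted((b * U[0], b * U[1]))
    # a*x must lie in E shifted by every admissible value of -b*u.
    lo, hi = E[0] - bu[1], E[1] - bu[0]
    return tuple(sorted((lo / a, hi / a)))


def target_reachable(X0, target, U, W, a, b, tau):
    """Backward recursion from the target at time tau down to time 0."""
    Q = target
    for _ in range(tau):
        E = erode(Q, W)
        if E is None:
            # The disturbance set is wider than the set to be hit.
            return False
        Q = predecessor(E, U, a, b)
    # Every initial state must lie in the set from which the target can be forced.
    return Q[0] <= X0[0] and X0[1] <= Q[1]


# Hypothetical example: the target [-0.5, 0.5] is reachable from [-2, 2]
# in two steps but not in one.
two_steps = target_reachable((-2.0, 2.0), (-0.5, 0.5), (-1.0, 1.0), (-0.1, 0.1), 1.0, 1.0, tau=2)
one_step = target_reachable((-2.0, 2.0), (-0.5, 0.5), (-1.0, 1.0), (-0.1, 0.1), 1.0, 1.0, tau=1)
```

The recursion runs backwards in time, which is why the target set must be known to the designer a priori.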

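Problem 2. Consider a time $0 < \tau \in \mathbb{N}$ and suppose $\mathcal{X}_0$ is given and fixed. Design a control function $u_k = u_k(x_k) : \mathcal{X}_k \to \mathcal{U}_k$ mapping
$\mathcal{X}_k$ into $\mathcal{U}_k$, for all $k \in \{0, \dots, \tau - 1\}$, such that the following holds

$$f_{k-1} + g_{k-1} : \mathcal{W}_{k-1} \times \{\mathcal{X}_{k-1} \times \mathcal{U}_{k-1}\} \to \mathcal{X}_k^* \qquad (5)$$

for all $k \in \{1, \dots, \tau\}$ and $\forall w_k \in \mathcal{W}_k$. The sequence of sets $\{(\mathcal{X}_k^*, k)\}_{k=1}^{\tau}$, $0 < \tau \in \mathbb{N}$, constitutes a desired, or target, tube in $\mathbb{R}^n \times \mathbb{N}$
for the controlled system (1) and should be known to the control designer a priori.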

We  introduce  the  following  definition. 

Definition 3. A sequence of sets $\{\mathcal{X}_k^*\}_{k=1}^{\tau}$, $0 < \tau \in \mathbb{N}$, is said to be reachable given $\mathcal{X}_0$ if there exists at least one sequence of control
functions $u_k : \mathcal{X}_k \to \mathcal{U}_k$, $\forall k \in \{0, \dots, \tau - 1\}$, that solves Problem 2.

In a different, but equivalent, notation we have it that $u_k : \mathcal{X}_k^* \to \mathcal{U}_k$ when $k \in \{1, \dots, \tau - 1\}$.

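Consider a set

$$\mathcal{T}_k^* = \{x \in \mathcal{X}_k^* : f_k(x, u_k) \in \mathcal{E}_{k+1} \text{ for some } u_k \in \mathcal{U}_k\} \qquad (6)$$

where

$$\mathcal{E}_{k+1} = \{x \in \mathbb{R}^n : (x + g_k(w_k)) \in \mathcal{X}_{k+1}^*,\ \forall w_k \in \mathcal{W}_k\} \qquad (7)$$

where we abuse notation and let $\mathcal{X}_0^* = \mathcal{X}_0$. Suppose $\mathcal{T}_0^* \neq \emptyset$. Then $x_0 \in \mathcal{T}_0^* \subset \mathcal{X}_0$ implies $x_1 \in \mathcal{X}_1^*$. Similarly, if $\mathcal{T}_1^* \neq \emptyset$ then
$x_1 \in \mathcal{T}_1^* \subset \mathcal{X}_1^*$ implies $x_2 \in \mathcal{X}_2^*$ etc.

Proposition 2. The sequence of target sets $\{\mathcal{X}_k^*\}_{k=1}^{\tau}$, $0 < \tau \in \mathbb{N}$, is reachable from all $x_0 \in \mathcal{X}_0$ if and only if $\mathcal{T}_k^* \neq \emptyset$ for all
$k \in \{0, \dots, \tau - 1\}$ and $\mathcal{T}_0^* = \mathcal{X}_0$.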

III.  A  Compromised  System  Architecture 

The  system  introduced  in  the  previous  section  is  depicted  in  Figure  1. 


Fig.  1.  A  nominal  robust  control  system. 

However,  suppose  now  an  attacking,  or  malicious,  agent  places  itself  within  the  system  architecture  as  shown  in  Figure  2. 


Fig.  2.  A  compromised  robust  control  system.  Note  we  will  substitute  for  in  the  text  of  this  paper  in  order  to  distinguish  the  trajectories  of  a  nominal 
system  and  the  compromised  system  for  some  fixed  sequence  of  and  w&. 


The  corresponding  compromised  time-varying  discrete-time  system  is  then 

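$$x_{k+1} = f_k(x_k, z_k) + g_k(w_k) + v_k \qquad (8)$$

defined for $k \in \mathbb{N} \cup \{0\} = \mathbb{N}_0$ and where $x_k \in \tilde{\mathcal{X}}_k \subset \tilde{\mathcal{X}} \subset \mathbb{R}^n$ and $w_k \in \mathcal{W}_k \subset \mathcal{W} \subset \mathbb{R}^l$ are the state and disturbance vectors. Now
$z_k \in \mathcal{Z}_k \subset \mathcal{Z} \subset \mathcal{U} \subset \mathbb{R}^m$ is the compromised control vector and $v_k \in \mathcal{V}_k \subset \mathcal{V} \subset \mathbb{R}^n$ is an explicit attack vector3. Note that we suppose
$\mathcal{U}$ is the entire set of admissible control inputs to the system and thus $\mathcal{Z} \subset \mathcal{U}$ makes sense. Thus, the functions $f_k : \mathbb{R}^n \times \mathcal{U} \to \mathbb{R}^n$ and
$g_k : \mathcal{W}_k \to \mathbb{R}^n$ are known as before and

$$f_k + g_k + v_k : \mathcal{W}_k \times \{\mathbb{R}^n \times \mathcal{U}\} \times \mathcal{V}_k \to \tilde{\mathcal{X}}_{k+1} \qquad (9)$$

as expected. The following definition is given for completeness.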

3The  main  analysis  will  neglect  such  an  input.  This  is  not  unreasonable  since  in  practice  an  attacker  is  likely  to  systematically  gain  access  to  and  distort 
only  the  nominal  control  input. 
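Definition 4. Suppose $\mathcal{X}_0$ is given and fixed. Suppose a sequence of compromised controller functions is specified of the form
$z_k = z_k(x, u_k) : \tilde{\mathcal{X}}_k \times \mathcal{U}_k \to \mathcal{Z}_k$ where $u_k = u_k(x_k) : \mathcal{X}_k \to \mathcal{U}_k$ is a sequence of fixed nominal control vectors. Then

$$\tilde{\mathcal{X}}_{k+1} = \{x_{k+1} \in \mathbb{R}^n : x_{k+1} = f_k(x_k, z_k) + g_k(w_k) + v_k,\ \forall x_k \in \tilde{\mathcal{X}}_k,\ \forall z_t \in \mathcal{Z}_t,\ \forall u_t \in \mathcal{U}_t,\ \forall w_t \in \mathcal{W}_t,\ \forall v_t \in \mathcal{V}_t,\ \forall t \in \{0, \dots, k\}\}$$

is called the local set of admissible compromised state values under the sequences $\{\mathcal{Z}_t\}_{t=0}^{k}$ and $\{\mathcal{U}_t\}_{t=0}^{k}$ at time $k$ for all $k \geq 0$.

If we suppose that $z_k$ may take any value in $\mathcal{U}$ at time $k$ and suppose $\mathcal{X}_0$ is given and fixed then

$$\tilde{\mathbb{X}}_{k+1} = \{x_{k+1} \in \mathbb{R}^n : x_{k+1} = f_k(x_k, z_k) + g_k(w_k) + v_k,\ \forall x_k \in \tilde{\mathbb{X}}_k,\ \forall z_t \in \mathcal{U},\ \forall w_t \in \mathcal{W}_t,\ \forall v_t \in \mathcal{V}_t,\ \forall t \in \{0, \dots, k\}\}$$

is called the global set of admissible compromised state values at time $k$ for all $k \geq 0$.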

The  following  result  is  given  for  completeness. 

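Proposition 3. Suppose $\tilde{\mathcal{X}}_0 = \mathcal{X}_0$. Then $\mathbb{X}_k \subset \tilde{\mathbb{X}}_k$, and $\mathbb{X}_k = \tilde{\mathbb{X}}_k$ if and only if $v_t = 0$ for all $0 \leq t \leq k - 1$. Here $\mathbb{X}_k$ and $\tilde{\mathbb{X}}_k$
denote the global sets of admissible nominal and compromised state values respectively.

In general, $\mathbb{X}_k \subset \tilde{\mathbb{X}}_k$ implies the malicious agent may be able to distort the trajectory of the system into alternative regions of the state
space $\mathbb{R}^n$ than are otherwise admissible.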

IV.  When  Can  an  Attack  Compromise  the  Safety  of  the  System? 

In  this  section  we  will  consider  the  notion  of  a  safe  operating  region  for  a  particular  control  system. 

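Definition 5. Consider a sequence of regions $\mathcal{S}_k \subset \mathbb{X}_k = \tilde{\mathbb{X}}_k \subset \mathbb{R}^n$ for all $k$, where $\mathbb{X}_k$ and $\tilde{\mathbb{X}}_k$ denote the global sets of admissible nominal
and compromised state values. We call $\mathcal{S}_k$ the safe operating region at $k$ for the nominal system (1) and if $x_k \notin \mathcal{S}_k$ we say the safety of the system is compromised.

Note that $\cup_k \mathcal{S}_k \subset \cup_k \mathbb{X}_k = \cup_k \tilde{\mathbb{X}}_k \subset \mathbb{R}^n$. Before we look at the damage an attacker can cause under some modelling assumptions, we
first look at the nominal control problem again, i.e. Problem 2.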

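Define a set

$$\mathcal{U}_k^{\circ} = \{u \in \mathcal{U}_k : f_k(x_k, u) \in \mathcal{E}_{k+1},\ \forall x_k \in \mathcal{X}_k^*\} \qquad (10)$$

where we abuse notation and let $\mathcal{X}_0^* = \mathcal{X}_0$. Recall

$$\mathcal{E}_{k+1} = \{x \in \mathbb{R}^n : (x + g_k(w_k)) \in \mathcal{X}_{k+1}^*,\ \forall w_k \in \mathcal{W}_k\} \qquad (11)$$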

We  then  have  the  following  result. 

Proposition 4. The sequence of target sets $\{\mathcal{X}_k^*\}_{k=1}^{\tau}$, $0 < \tau \in \mathbb{N}$, is reachable for all $x_0 \in \mathcal{X}_0$ if $\mathcal{U}_k^{\circ} \neq \emptyset$, $\forall k$.

The proof is omitted but note the condition $\mathcal{U}_k^{\circ} \neq \emptyset$, on the set defined in (10), is only sufficient and not necessary to solve Problem 2. In some sense, the set
defined in (10) neglects the control history $u_t$, for $t \in \{0, \dots, k - 1\}$. We can state a necessary and sufficient condition on a set of control
inputs that ensures Problem 2 is solved.

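Define the set

$$\mathcal{U}_k^* = \{u \in \mathcal{U}_k : f_k(x_k, u) \in \mathcal{E}_{k+1},\ \forall x_k \in \mathcal{F}_{k-1}(u_{k-1})\} \qquad (12)$$

where

$$\mathcal{F}_k(u_k) = f_k(\mathcal{X}_k, u_k) \oplus g_k(\mathcal{W}_k) \qquad (13)$$

where $\oplus$ is the set-theoretic Minkowski sum. We also let $\mathcal{F}_{-1} = \mathcal{X}_0$. Using $\mathcal{U}_k^*$ we can account for a particular history of control laws.

Proposition 5. The sequence of target sets $\{\mathcal{X}_k^*\}_{k=1}^{\tau}$, $0 < \tau \in \mathbb{N}$, is reachable for all $x_0 \in \mathcal{X}_0$ if and only if $\mathcal{U}_k^* \neq \emptyset$, $\forall k$.

Obviously the following result follows from the preceding two propositions.

Corollary 1. The set defined in (10) is a subset of $\mathcal{U}_k^*$.

The set $\mathcal{U}_k^*$ is, in principle, computable and can, in special cases, lead to the closed-form design of control functions $u_k$. Indeed, once
$\mathcal{U}_k^*$ is computed, any $u_k \in \mathcal{U}_k^*$ is sufficient. (As is typical, computation of these control input sets involves a backward recursion.)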
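
To see why the condition based on the set in (10) is only sufficient, consider a hedged scalar sketch. It assumes x_{k+1} = a x_k + b u_k + w_k with interval sets written as `(lo, hi)` tuples; the function names and numbers are hypothetical. The set in (10) asks for a single input that works for every state in the current cross-section of the tube, which can fail even when a state-dependent feedback law succeeds.

```python
def erode(Q, W):
    """E = {x : x + w in Q for all w in W}, cf. (11); None if empty."""
    lo, hi = Q[0] - W[0], Q[1] - W[1]
    return (lo, hi) if lo <= hi else None


def uniform_control_set(Xk, E, U, a, b):
    """Inputs u in U with a*x + b*u in E for every x in Xk, cf. (10); None if empty."""
    ax = sorted((a * Xk[0], a * Xk[1]))
    # b*u must shift the whole interval a*Xk inside E.
    lo_bu, hi_bu = E[0] - ax[0], E[1] - ax[1]
    if lo_bu > hi_bu:
        return None
    if b == 0:
        return U if lo_bu <= 0.0 <= hi_bu else None
    cand = sorted((lo_bu / b, hi_bu / b))
    lo, hi = max(cand[0], U[0]), min(cand[1], U[1])
    return (lo, hi) if lo <= hi else None


# Hypothetical tube: cross-section [-1, 1] now, [-0.5, 0.5] at the next step.
E_next = erode((-0.5, 0.5), (-0.1, 0.1))
wide = uniform_control_set((-1.0, 1.0), E_next, (-1.0, 1.0), 1.0, 1.0)
narrow = uniform_control_set((-0.2, 0.2), E_next, (-1.0, 1.0), 1.0, 1.0)
```

For the wide cross-section no single input works, so the set is empty, yet the feedback u = -x keeps every state in [-1, 1] inside the next cross-section. This matches the remark that the condition is not necessary.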

Now  we  examine  when  an  attack  can  compromise  the  safety  of  a  particular  system.  We  make  the  following  two  assumptions. 

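Assumption 1. Consider the two sets $\mathbb{N}_0$ and $\mathbb{N}_{\theta+1}$ and suppose $\mathcal{X}_0$ is given. There exists a nominal control $u_k = u_k(x_k) : \mathcal{X}_k \to \mathcal{U}_k$,
$\forall k \in \{0, \dots, \tau - 1\}$, such that $\mathcal{X}_k = \mathcal{X}_k^* \subset \mathcal{S}_k$ for all $k \in \{0, \dots, \tau\}$.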

The  previous  assumption  states  that  under  a  nominal,  i.e.  non-compromised  control,  the  system  will  operate  within  its  desired  target  tube 
and  within  its  safety  region. 

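Assumption 2. Consider the two sets $\mathbb{N}_0$ and $\mathbb{N}_\theta$ and suppose $\mathcal{X}_0$ is given. The attacker can hijack the nominal control inputs at times
$k \in \{\theta, \dots, \tau - 1\}$. The compromised control is $z_k = z_k(x, u_k) : \tilde{\mathcal{X}}_k \times \mathcal{U}_k \to \mathcal{Z}_k$. Also, $v_t \in \mathcal{V}_t = \{0\}$ implying $\mathbb{X}_k = \tilde{\mathbb{X}}_k$ for all $k$ and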
Xk  —  Xk. 
This assumption implies that an attacker can hijack the control input at some time $k = \theta$. Before this time the system is nominally
controlled.

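Definition 6. Consider a sequence of regions $\mathcal{S}_k \subset \mathbb{X}_k = \tilde{\mathbb{X}}_k \subset \mathbb{R}^n$ for all $k$. We say the safety of the system is compromised by an attack
if $x_\tau \notin \mathcal{S}_\tau$ at some $\tau$, and $\exists k \in \{0, \dots, \tau - 1\}$ such that $z_k \neq u_k$.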


Note  if  the  safety  of  the  system  is  compromised  by  an  attack  then  necessarily  the  state  of  the  system  is  outside  the  desired  target  tube 
since  Xk  C  Sk  for  all  k.  However,  the  system  may  be  steered  outside  the  target  tube  by  an  attacker  but  remain  within  a  safe  operating 
region  for  the  system. 

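Define a set $\mathcal{A}_k = \tilde{\mathbb{X}}_k \setminus \mathcal{S}_k \subset \tilde{\mathbb{X}}_k = \mathbb{X}_k$. This set $\mathcal{A}_k$ is a pseudo-target set for an attacker wishing to compromise the safety of the
system. For simplicity, we assume it is sufficient to consider a target set $\mathcal{A}_\tau = \tilde{\mathbb{X}}_\tau \setminus \mathcal{S}_\tau \subset \tilde{\mathbb{X}}_\tau = \mathbb{X}_\tau$.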

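For $k \in \{\theta, \dots, \tau - 1\}$, consider a set

$$\mathcal{Z}_k^* = \{z \in \mathcal{Z}_k : f_k(x_k, z) \in \mathcal{H}_{k+1},\ \forall x_k \in \mathcal{F}_{k-1}(z_{k-1})\} \qquad (14)$$

where

$$\mathcal{H}_{k+1} = \{x \in \mathbb{R}^n : (x + g_k(w_k)) \in \mathcal{Q}_{k+1},\ \forall w_k \in \mathcal{W}_k\} \qquad (15)$$

with the constraint $\mathcal{Q}_\tau = \mathcal{A}_\tau$ and

$$\mathcal{F}_k(z_k) = f_k(\mathcal{X}_k, z_k) \oplus g_k(\mathcal{W}_k) \qquad (16)$$

as before. We also let $\mathcal{F}_{\theta-1}(z_{\theta-1}) = \mathcal{F}_{\theta-1}(u_{\theta-1})$ and note that one could think of the relationship $\mathcal{Z}_k = \mathcal{U}_k$ for all $k \in \{0, \dots, \theta - 1\}$.

Theorem 1. The safety of the system at time $k = \tau$ can be compromised by an attacker initiating an attack at some time $k = \theta < \tau$ if and
only if $\mathcal{Z}_k^* \neq \emptyset$ for all $k \in \{\theta, \dots, \tau - 1\}$.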


The set $\mathcal{Z}_k^*$ is, in principle, computable and can, in special cases, lead to the closed-form design of attack functions $z_k$. Indeed, once $\mathcal{Z}_k^*$
is computed, any $z_k \in \mathcal{Z}_k^*$ is sufficient to drive the system outside its safety region so long as the necessary and sufficient condition of the
preceding theorem holds. (As is typical, computation of these attack input sets involves a backward recursion.)

The preceding theorem defines a necessary and sufficient set-theoretic condition on the attacker's control law given an initial and terminal
attack time $\theta$ and $\tau$ respectively. Note that the time length $\tau - \theta$ obviously plays a role in whether or not the condition of the preceding
theorem will hold. This preceding result leads to a straightforward definition concerning robust, secure control.

Definition 7. If $\forall x_\theta \in \mathcal{F}_{\theta-1}(u_{\theta-1})$ $\exists k \in \{\theta, \dots, \tau - 1\}$ with $\mathcal{Z}_k^* = \emptyset$ then the system is said to be securely controllable.


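Consider a set function

$$\mathcal{P}_k(\mathcal{N}, \mathcal{C}, \mathcal{M}) = \{x \in \mathcal{N} : f_k(x, \mathcal{C}) \oplus g_k(\mathcal{W}_k) \subset \mathcal{M}\} \qquad (17)$$

of those $x \in \mathcal{N}$ such that $(f_k(x, u) + g_k(w_k)) \in \mathcal{M}$ for every input $u \in \mathcal{C}$ and $\forall w_k \in \mathcal{W}_k$.

Theorem 2. Suppose at time $\theta$ that $x_\theta \in \mathcal{X}_\theta \subset \mathcal{S}_\theta$. Let $\mathcal{F}_{\theta-1}(\mathcal{Z}_{\theta-1}) = \mathcal{X}_\theta$ for notational simplicity. If

$$\mathcal{P}_t(\mathcal{F}_{t-1}(\mathcal{Z}_{t-1}) \cap \mathcal{S}_t,\ \mathcal{Z}_t,\ \mathcal{S}_{t+1}) \neq \mathcal{F}_{t-1}(\mathcal{Z}_{t-1}) \cap \mathcal{S}_t \qquad (18)$$

for some $t \geq \theta$ then the system is securely controllable so long as $\tau < t$.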

The  preceding  theorem  is  the  main  result  and  provides  a  condition  for  a  system  designer  which  results  in  a  necessary  and  sufficient  lower 
bound  on  the  number  of  consecutive  attack  signals  required  to  compromise  the  safety  of  the  system. 

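Corollary 2. Suppose at time $\theta$ that $x_\theta \in \mathcal{X}_\theta \subset \mathcal{S}_\theta$. Let $\mathcal{F}_{\theta-1}(\mathcal{Z}_{\theta-1}) = \mathcal{X}_\theta$ for notational simplicity. Consider the optimization problem

$$\tau^* = \operatorname{argmax}\ \tau \qquad (19)$$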


such  that  p  I  P|(Jri_i(^_i)  n<S*) 

\t=e 

—p  ( Vt{Tt-i(Zt-i)  n  St.,  Zt, St+1 


>  0 


where the operation $\mu(\cdot)$ gives the Lebesgue measure of its argument. Then there exists a suitable attack sequence of length $\tau^* - \theta$ that
can compromise the safety of the system at time $\tau^*$. Conversely, there is no attack sequence of a shorter duration that can compromise the
safety of the system.


The  preceding  two  results,  for  example,  provide  computable  functions  on  the  known  system  parameters,  e.g.  the  model  and  the  input 
constraints  etc,  which  can  aid  the  system  designer  in  securing  the  control  system.  In  practice,  if  an  attacker  wanted  to  drive  the  state  outside 
some  safety  region  and  the  attacker  had  an  arbitrarily  long  time  period  within  which  to  distort  the  system  controller  then  (apart  from  some 
special  cases)  it  is  unreasonable  to  expect  this  to  be  impossible. 

The preceding two results lead to a necessary and sufficient lower bound on the time period in which an attacker can take the state from a
safety region to an unsafe region. The system designer can then design an attack detection algorithm or a monitoring process etc that attempts
to  verify  the  integrity  of  the  controller  within  this  time  period.  It  has  also  been  discussed  in,  e.g.,  [8]  that  an  attacker  may  only  intermittently 
be  able  to  distort  a  control  signal.  Thus,  a  designer  may  be  able  to  configure  the  system  such  that  the  lower  bound  introduced  in  this  paper 
is  increased. 
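
The idea of a lower bound on the attack duration can be made concrete with a simplified sketch in the spirit of Theorem 2 and Corollary 2; it is not the computation given in the source. It assumes a scalar system x_{k+1} = a x_k + b z_k + w_k, interval sets written as `(lo, hi)` tuples, a constant safe region, and a hypothetical function name. The sketch propagates the set of states an attacker could produce and reports the first step at which that set is no longer contained in the safe region, which in this scalar setting corresponds to the first time the inequality (18) can hold.

```python
def min_attack_steps(x_theta, Z, W, S, a, b, max_steps=1000):
    """Smallest number of hijacked inputs after which some state can leave S.

    Returns None if the propagated set stays inside S for max_steps steps.
    """
    R = x_theta
    for n in range(1, max_steps + 1):
        ax = sorted((a * R[0], a * R[1]))
        bz = sorted((b * Z[0], b * Z[1]))
        # One-step image of R under every attack input and every disturbance.
        R = (ax[0] + bz[0] + W[0], ax[1] + bz[1] + W[1])
        if R[0] < S[0] or R[1] > S[1]:
            return n
    return None


# Hypothetical numbers: attack inputs in [-1, 1], disturbance in [-0.1, 0.1],
# safe region [-5, 5], state at the attack onset in [-0.5, 0.5].
steps_needed = min_attack_steps((-0.5, 0.5), (-1.0, 1.0), (-0.1, 0.1), (-5.0, 5.0), a=1.0, b=1.0)
never = min_attack_steps((-0.5, 0.5), (-1.0, 1.0), (-0.1, 0.1), (-5.0, 5.0), a=0.5, b=1.0, max_steps=50)
```

In the first case at least five consecutive compromised inputs are needed, so an integrity check that runs at least that often would act before the safe region can be left. In the second case the contracting dynamics keep the state safe over the tested horizon.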
V. Concluding Remarks

This paper briefly explored the classical notion of control system reachability as applied within a secure control theoretic framework.

Note that all the set-theoretical results introduced in this paper are, in principle, computable; e.g. it is possible to design controller and attack
input sets that drive the state of the system to the respective target sets (so long as those sets are reachable). For linear time-varying systems,
where the control, attack and disturbance sets are ellipsoidal etc, the set-theoretical results introduced in this work can be well-approximated
using ellipsoidal calculus.
Abstract: This paper outlines the problem of multi-static
Doppler-based target position and velocity estimation. The Fisher
information matrix is derived given a separate target illuminator
and then given a target-based isotropic signal emission. Some
remarks concerning the Cramer-Rao inequality and its relationship
to the estimation problem are given. Some results concerning the
placement of the receivers are given and some open problems
are discussed.


I.  Introduction 


Using Doppler-shifts for position and velocity estimation
has a long history; see e.g. [1]–[4]. Consider a scenario with
$n_s$ sensors and $n_o$ transmitters; see Figure 1. The target has
an unknown position and velocity $x = [p^T\ v^T]^T \in \mathbb{R}^4$. The
position of the sensors is given by $s_i = [s_{i,1}\ s_{i,2}]^T \in \mathbb{R}^2$, $\forall i \in
\{1, \dots, n_s\}$, while the position of the transmitters is given by
$o_j = [o_{j,1}\ o_{j,2}]^T \in \mathbb{R}^2$. We let $\phi_i \in [0, 2\pi)$ denote the bearing to
the target at the ith sensor and $\theta_j \in [0, 2\pi)$ denote the bearing
to the target at the jth transmitter. The angle subtended at the
target by two sensors i and j is denoted by $\vartheta_{ij} = \vartheta_{ji} \in [0, \pi]$;
similarly the angle subtended at the target by two transmitters
i and j is denoted by $\psi_{ij} = \psi_{ji} \in [0, \pi]$.

The measured Doppler-shift is $\hat{f}_{ij}$ at the ith sensor and is
caused by a target reflection due to a signal generated by the
jth transmitter. This frequency shift can be approximated by

$$\begin{aligned}
\hat{f}_{ij} &= f_{ij} + w_{ij} && (1a) \\
&= \frac{f_{c,ij}}{2c} \left( \frac{(p - s_i)^T}{\|p - s_i\|} + \frac{(p - o_j)^T}{\|p - o_j\|} \right) v + w_{ij} && (1b) \\
&= C_{ij}(p)\, v + w_{ij} && (1c)
\end{aligned}$$

where c is the speed of light (or signal propagation), $\|\cdot\|$
is the standard Euclidean vector norm and $f_{c,ij}$ is the carrier
frequency employed by this transmitter/sensor pair. Finally,
$w_{ij}$ is a zero mean Gaussian random variable with known
variance $\sigma_{ij}^2$. For notational simplicity, we employ the
two-dimensional index $\lambda = (j, i)$ and the following normalization
$f_{c,ij}/(2c) = 1$. The set of all transmitter/sensor pairs is
denoted $\Lambda \subset [1, n_s] \times [1, n_o]$ and we employ a lexical order;
$(\lambda(1), \lambda(2), \dots, \lambda(n)) = \Lambda$, where n is the total number of
transmitter/sensor pairs.
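
The normalized measurement model can be sketched in a few lines. The sketch assumes the normalized form in which the leading coefficient equals one, so that the noise-free shift is the inner product of the velocity with the sum of the two unit vectors pointing from the sensor and from the transmitter towards the target. The function names and the optional `scale` argument are hypothetical, and the sign convention is taken as written in the normalized model.

```python
import math


def unit_vector(p, q):
    """Unit vector pointing from q towards p in the plane."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("target coincides with the sensor or transmitter")
    return (dx / norm, dy / norm)


def doppler_row(p, s, o):
    """Row C(p) of the normalized model: sum of the two unit vectors."""
    u1 = unit_vector(p, s)  # sensor to target
    u2 = unit_vector(p, o)  # transmitter to target
    return (u1[0] + u2[0], u1[1] + u2[1])


def doppler_shift(p, v, s, o, scale=1.0):
    """Noise-free shift C(p) v; scale stands in for the carrier coefficient."""
    c = doppler_row(p, s, o)
    return scale * (c[0] * v[0] + c[1] * v[1])


# Collocated pair directly behind the target's direction of travel.
collocated = doppler_shift((0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (-1.0, 0.0))
# Sensor and transmitter on opposite sides of the target: contributions cancel.
opposed = doppler_shift((0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (-1.0, 0.0))
```

The second case shows a geometry in which the measurement carries no information about the velocity component along the sensor-transmitter line.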

Fig. 1. A typical multi-static target state estimation scenario with one target,
two sensors and two transmitters.

In the following we delineate four different, but related,
RADAR/sensor network configurations. Our focus is on the
types of measurements generated by the sensor and the
time-energy characteristic of each configuration. The speed of signal
propagation in all scenarios is a known constant c.

1) Passive: Here the target itself transmits a continuous-wave
signal at some frequency $f_c$ which is received by
n sensors in the environment. Sensor measurements are
some combination of target bearing and Doppler frequency
shift. Bearing measurements require a hardware
array of antennas. The advantage of Doppler-only-based
sensors lies in the simplicity of the sensors since no
hardware array is needed to determine the frequency shift.
The obvious advantage of passive tracking systems is
that a passive sensor does not indicate its location to any
electronic listening devices. However, in contrast to the
active sensor network techniques described later, there
are numerous disadvantages to passive sensor networks:

a)  Unless  the  target  is  friendly,  knowledge  of  the  carrier 
frequency  fc  is  not  immediate  and  must  be  estimated. 

b)  High  value  and/or  intelligent  targets  will  only  transmit 
occasionally  -  if  at  all,  and  may  vary  the  transmitted 
waveform  (and  hence  fc )  on  each  occasion. 

c)  The  energy  received  by  the  sensor  network  is  dictated 
by  the  non-cooperating  target. 

2) Mono-static: Here each transmitter illuminates the entire
area of interest and each transmitter is paired with a
particular sensor. Each transmitted signal is reflected
by the target and received by the prescribed sensor.
The configuration is called mono-static whenever each
transmitter/sensor pair is collocated. The advantage of all
active sensor networks (mono-static or otherwise) is that,
in addition to angular and Doppler information, target
range information is available to the sensors. In contrast
to passive sensor networks, the transmitter frequency
$f_c$ is assumed to be known to the receiving sensor.
Moreover, the transmitted signal and illumination pattern
can be sophisticated in their design. These system design
parameters are at the discretion of the sensor network.

3) Bi-static: A bi-static sensor network is similar to a
mono-static network except that the transmitter/sensor
pairs are not collocated. Bi-static networks can exploit
network geometries by deploying the sensors "near" the
target, thus improving the received signal-to-noise ratio.
This in turn improves the accuracy and resolution
capabilities of the network. Furthermore, in an electronic warfare
context, a bi-static network does not give away the
location of any of its sensors to electronic listening devices.
The transmitted signal can encode additional information
such as the transmitter's location, etc. Alternatively, the
illuminator may be a so-called transmitter-of-opportunity
such as a television broadcaster etc.

4) Multi-static: Each transmitter $j \in [1, n_o]$ illuminates
some portion of the area of interest, which is then
reflected by the target and received at a number $n_j \in [1, n_s]$
of sensors. By employing modern Digital Array RADAR
(DAR) techniques, two or more transmitters and/or two
or more receivers may in fact be collocated. A bi-static
network is obviously a special case of a multi-static
network.

In  summary  we  note  the  following: 

1) Active transmission imposes considerable energy
requirements. This is particularly troublesome in a mobile sensor
network where the weight and energy consumption of
each sensor needs to be managed efficiently.

2) By employing sensors that measure Doppler-shift-only,
the complexity in the receivers is typically much less
than in those sensors that measure bearing and/or range.

3) Target Doppler velocity measurements are easy to
generate in an active network.

4) Bi-static and multi-static sensor networks can exploit
geometries to improve signal-to-noise ratio characteristics.

5) In bi-static and multi-static sensor networks the receivers
are passive sensors and have reduced
energy requirements in comparison to the illuminator.
In these networks the system designer can seek the
advantages of both active and passive radar systems while
minimizing the disadvantages of both individually.


II.  Problem  Scenario 


A.  Multi-Static  Scenario 


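From (1b) we have

$$\hat{f}_\lambda = (u_{\lambda_1} + u_{\lambda_2})^T v + w_\lambda \qquad (2a)$$

where

$$u_{\lambda_1} := \frac{p - s_i}{\|p - s_i\|} = \begin{bmatrix} \cos(\phi_{\lambda_1}) \\ \sin(\phi_{\lambda_1}) \end{bmatrix} \qquad (2b)$$

$$u_{\lambda_2} := \frac{p - o_j}{\|p - o_j\|} = \begin{bmatrix} \cos(\theta_{\lambda_2}) \\ \sin(\theta_{\lambda_2}) \end{bmatrix} \qquad (2c)$$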


are  the  unit  vectors  directed  towards  the  target  from  the  sensor 
and  transmitter  respectively.  Stacking  the  measurements  from 
all  transmitter/sensor  pairs  we  obtain 


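$$\begin{aligned}
\hat{\mathbf{f}} &= \mathbf{f} + \mathbf{w} && (3) \\
&= [f_{\lambda(1)}, f_{\lambda(2)}, \dots, f_{\lambda(n)}]^T + [w_{\lambda(1)}, \dots, w_{\lambda(n)}]^T && (4)
\end{aligned}$$

which obeys $\hat{\mathbf{f}} \sim \mathcal{N}(\mathbf{f}, R_f)$ with covariance matrix $R_f =
\mathrm{diag}(\sigma^2_{\lambda(1)}, \dots, \sigma^2_{\lambda(n)})$. Note that

$$\begin{aligned}
\hat{\mathbf{f}} &= \mathrm{diag}(C_{\lambda(1)}(p), \dots, C_{\lambda(n)}(p))\, v + \mathbf{w} && (5a) \\
&= H(p)\, v + \mathbf{w} && (5b)
\end{aligned}$$

which highlights the fact that $\hat{\mathbf{f}}$ is a vector of measurements that is
nonlinear in p but linear in the target velocity v.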
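
Linearity in v means that, if the position p were known, the velocity could be recovered by weighted linear least squares. The sketch below is an illustration under stated assumptions and not the estimator studied in the source: it assumes the normalized model, assumes H(p) is formed by stacking one row per transmitter/sensor pair, weights each measurement by the inverse of its variance, and solves the resulting 2 x 2 normal equations directly. All function names are hypothetical.

```python
import math


def doppler_row(p, s, o):
    """Row of H(p): sum of unit vectors from sensor s and transmitter o to p."""
    row = [0.0, 0.0]
    for q in (s, o):
        dx, dy = p[0] - q[0], p[1] - q[1]
        norm = math.hypot(dx, dy)
        row[0] += dx / norm
        row[1] += dy / norm
    return row


def estimate_velocity(p, pairs, shifts, variances):
    """Weighted least-squares velocity estimate for a known position p.

    pairs is a list of (sensor, transmitter) positions, one per measurement.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (s, o), f, var in zip(pairs, shifts, variances):
        c = doppler_row(p, s, o)
        w = 1.0 / var
        # Accumulate H^T R^-1 H and H^T R^-1 f.
        a11 += w * c[0] * c[0]
        a12 += w * c[0] * c[1]
        a22 += w * c[1] * c[1]
        b1 += w * c[0] * f
        b2 += w * c[1] * f
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("velocity is not observable from this geometry")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Two collocated pairs at right angles, noise-free shifts for v = (3, -4).
pairs = [((-1.0, 0.0), (-1.0, 0.0)), ((0.0, -1.0), (0.0, -1.0))]
v_hat = estimate_velocity((0.0, 0.0), pairs, shifts=[6.0, -8.0], variances=[1.0, 1.0])
```

When all pairs lie along a single line through the target the normal matrix is singular, which is one way to see that receiver placement affects what can be estimated.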

Note that the illuminators in this scenario can be quite
sophisticated in their design of the signal waveform and
illumination pattern. For example, the illumination signal can encode
the position of the transmitter along with other relevant signal
information. Alternatively, the illuminator may be a so-called
transmitter-of-opportunity such as a television broadcaster or
cellular phone tower. The receivers are passive sensors in this
scenario and can be relatively simple devices in comparison to
the illuminator. This scenario arguably seeks the advantages
of both active and passive radar systems while minimizing the
disadvantages of both individually.

We note some practical problems that arise in multi-static
scenarios owing to their distributed sensing nature. Firstly,
incoherence can be a problem when each sensor runs a local
oscillator independently of the other sensors and even of the
illuminator. Furthermore, signal measurements may arrive at
different sensors asynchronously. In addition, the sensors must
communicate either with a central estimator or amongst
themselves in order to make use of the received measurements.
Asynchronous polling, communication bandwidths, energy
consumption, etc. can all contribute to significant (often
underestimated) communication problems with multi-static
networks in practice.


B.  Mono-Static  Scenario 


As  noted  in  the  previous  section  we  consider  broadly  three 
different  network  configurations. 


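In a mono-static configuration we have an equal number of
sensors and transmitters, $n_s = n_o := n$, and each
sensor/transmitter pair is collocated. The following equality
conditions are immediate: for each $\lambda \in \Lambda$

\begin{align}
u_{\lambda_1} &= u_{\lambda_2} \ (=: u_\lambda) \tag{6a} \\
\phi_{\lambda_1} &= \theta_{\lambda_2} \tag{6b} \\
\hat{f}_\lambda &= u_\lambda^T v + w \tag{6c}
\end{align}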

C. Passive Configuration

In this configuration we have $n = n_s$ and the target itself is
the (isotropic) emitter. Observe that the measurement equation
(2) for the passive configuration coincides with the mono-static
case (6c). Geometrically speaking this should be no surprise, as
each passive sensor measures only that portion of the isotropic
emission that is directed towards the sensor.

III. The Cramer-Rao Inequality

If $\mathcal{I}(x)$ is the Fisher information matrix then the
Cramer-Rao inequality lower bounds the variance achievable by an
unbiased estimator. For an unbiased estimate $\hat{x}$ of $x$ we find

\[ E\left[(\hat{x} - x)(\hat{x} - x)^T\right] \ge \mathcal{I}(x)^{-1} \tag{7} \]

If $\mathcal{I}(x)$ is singular then (in general) no unbiased
estimator for x exists with a finite variance [5]. If (7) holds with
equality, for some unbiased estimate $\hat{x}$, then the estimator is
called efficient and the parameter estimate $\hat{x}$ is unique [5].
However, even if $\mathcal{I}(x)$ is non-singular it is not
guaranteed in practice that an unbiased estimator can be realized.
Alternatively, if an unbiased estimator can be realized, it is not
guaranteed that an efficient estimator exists [6]. The design of
specific estimation algorithms is not the immediate goal of this
work. However, it is obvious that an unbiased estimator
$\hat{x}(\hat{f})$ of x is one which obeys

\[ E\left[\hat{x}(\hat{f})\right] = \frac{1}{(2\pi)^{n/2}|R_f|^{1/2}} \int \hat{x}(f + w)\, \exp\left(-\tfrac{1}{2} w^T R_f^{-1} w\right) dw = x \]

where again $\hat{f} = f + w$. Starting with this constraint, there
are a number of strategies for designing unbiased estimators; see
[6]. We will not explore this concept further. The condition (7)
says nothing about the performance and realizability of biased
estimators. That is, in order to use (7) we must consider only
unbiased estimators [5]. The (i, j)th element of $\mathcal{I}(x)$ is
given by

\[ \mathcal{I}_{i,j}(x) = E\left[ \frac{\partial}{\partial x_i} \ln\left(f_{\hat{f}}(\hat{f}; x)\right) \frac{\partial}{\partial x_j} \ln\left(f_{\hat{f}}(\hat{f}; x)\right) \right] \tag{9} \]

where $[p^T \ v^T]^T = [x_1 \ \dots \ x_4]^T \in \mathbb{R}^4$ and
$f_{\hat{f}}(\hat{f}; x)$ is the Gaussian likelihood function. We then
easily find $\mathcal{I}(x) = \nabla_x f(x)^T R_f^{-1} \nabla_x f(x)$.
The Fisher information metric characterizes the nature of the
likelihood function. If the likelihood function is sharply peaked
then the true parameter value is easier to estimate from the
measurements than if the likelihood function is flatter.

A. General Results Concerning Each Scenario

The Fisher information matrix is given by (11). We use
$\mathcal{I}(x)$, $\mathcal{I}(p)$ and $\mathcal{I}(v)$ to denote the
Fisher information matrix defined by considering only the
parameters x, p and v respectively. Both $\mathcal{I}(p)$ and
$\mathcal{I}(v)$ turn out to be principal sub-matrices of
$\mathcal{I}(x)$. In all cases, independent measurements from
additional sensors in general positions will never decrease the
total information in each x, p and v.

Proposition 1. The condition $n \ge 4$ is a necessary condition
for $\mathcal{I}(x)$ to be non-singular.

Proof: Recall that $\mathcal{I}(x) = \nabla_x f(x)^T R_f^{-1} \nabla_x f(x)$
or, given $R_f = \operatorname{diag}(\sigma_1^2, \dots, \sigma_n^2)$, we have

\[ \mathcal{I}(x) = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \nabla_x f_i^T \nabla_x f_i \tag{12} \]

which is a sum of matrices each with rank at most 1. Now a
well-known result states that a rank-k matrix can be written as
the sum of k rank-1 matrices but not fewer. This immediately
implies our result and completes the proof. ■
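
The rank argument used in the proof of Proposition 1 can be checked numerically. The sketch below is not from the source: it sums n outer products of arbitrary 4-vectors, which stand in for the gradient rows, and computes the rank of the sum by Gaussian elimination. The helper names `outer_sum` and `rank` are illustrative.

```python
import random

def outer_sum(rows):
    """Return sum_i g_i^T g_i for a list of equal-length row vectors g_i."""
    d = len(rows[0])
    return [[sum(g[a] * g[b] for g in rows) for b in range(d)] for a in range(d)]

def rank(matrix, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

random.seed(0)
for n in range(1, 6):
    # Random stand-ins for the n gradient rows (an assumption for illustration).
    grads = [[random.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    print(n, rank(outer_sum(grads)))
```

With fewer than four terms the 4 x 4 sum cannot reach full rank, which is the content of the necessary condition.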

Proposition 2. The condition $n \ge 2$ is a necessary condition
for $\mathcal{I}(p)$ to be non-singular.

Proposition 3. The condition $n \ge 2$ is a necessary condition
for $\mathcal{I}(v)$ to be non-singular.

Proposition  4.  The  following  statements  concerning  efficient 
estimators  hold. 

1)  If  n  is  finite  then  no  efficient  estimator  exists  for  x. 

2)  If  p  is  known  and  n  is  finite  then  an  efficient  estimator 
for  v  exists  and  is  given  by  the  standard  linear  maximum 
likelihood  estimator. 

3)  If  v  is  known  and  n  is  finite  then  no  efficient  estimator 
exists  for  p. 

Proof: This result follows from a general result concerning
efficient estimators given in [6]. ■
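
Part 2 of Proposition 4 can be illustrated with a short sketch. It assumes the normalized mono-static or passive measurement model f_i = u_i^T v + w_i with independent Gaussian noise, in which case the maximum likelihood estimate of v for known p is a weighted least-squares solution. The function names and the test geometry are hypothetical, and the normalization is an assumption rather than something taken from the source.

```python
import math

def unit_vector(p, s):
    """Unit vector pointing from sensor position s to target position p."""
    dx, dy = p[0] - s[0], p[1] - s[1]
    r = math.hypot(dx, dy)
    return (dx / r, dy / r)

def ml_velocity(p, sensors, f_meas, sigmas):
    """Weighted least-squares (Gaussian ML) estimate of v for known p.

    Assumes f_i = u_i^T v + w_i with independent w_i ~ N(0, sigma_i^2).
    """
    a = b = d = g0 = g1 = 0.0
    for s, f, sig in zip(sensors, f_meas, sigmas):
        ux, uy = unit_vector(p, s)
        w = 1.0 / sig**2
        a += w * ux * ux      # entries of H^T R^-1 H
        b += w * ux * uy
        d += w * uy * uy
        g0 += w * ux * f      # entries of H^T R^-1 f
        g1 += w * uy * f
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("information matrix is singular; need at least "
                         "two sensors not collinear with the target")
    # Solve the 2x2 normal equations explicitly.
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)

# Noise-free example: the estimate reproduces the true velocity.
p = (0.0, 0.0)
sensors = [(-1.0, 0.0), (0.0, -1.0), (-1.0, -1.0)]
v_true = (3.0, -2.0)
f = [ux * v_true[0] + uy * v_true[1]
     for ux, uy in (unit_vector(p, s) for s in sensors)]
print(ml_velocity(p, sensors, f, [1.0, 1.0, 1.0]))
```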

We  do  not  consider  the  design  of  unbiased  (but  inefficient) 
estimators  for  either  p  or  x  in  this  work.  However,  we  know 
that  no  unbiased  estimator  for  x  exists  with  a  finite  variance 
when  n  <  4.  Similarly,  no  unbiased  estimator  for  p  or  v  exists 
with  a  finite  variance  when  n  <  2. 

B.  Discussion  on  the  Cramer-Rao  Bound 

The  Cramer-Rao  inequality  assumes  an  unbiased  estimation 
algorithm  and  an  estimator  which  achieves  the  inequality  is 
called  an  efficient  estimator.  An  efficient  estimator  does  not 
exist  for  x  or  p  when  n  is  finite  but  does  exist  for  v  when  p  is 
given.  Even  if  an  efficient  estimator  does  not  exist  then  it  may 
be  possible  to  design  an  unbiased  estimator.  This  possibility  is 
not  explored  in  this  work.  In  practice,  a  system  designer  may 
be  constrained  in  their  choice  of  parameter  estimator.  Likely, 
the estimation technique used in practice will be biased [7],
[8]. For example, even the well-known maximum likelihood
localization  techniques  are  only  asymptotically  unbiased  and 
efficient,  i.e.  require  the  number  of  sensors  to  approach  infinity. 
The  Cramer-Rao  bound  for  unbiased  estimators  is  still  an 
interesting  benchmark  with  which  intuitively  pleasing  results 
and  performance  measures  can  be  derived.  However,  these 
results  can  only  be  considered  as  a  guide. 

It is interesting that the variance (or mean-square-error) of
an estimate can sometimes be made smaller at the expense of
increasing the bias [9]. The work of [10], [11] explores the
concept of bias-variance trade-offs in estimation. In [5], [11]
a biased Cramer-Rao inequality and in [10], [11] a uniform
Cramer-Rao inequality are developed and can be used to study
this so-called bias-variance trade-off. These ideas are yet to be
fully explored in the localization and target tracking literature.

IV.  On  the  Fisher  Information  for  Velocity 
Estimation 

Doppler-based  measurements  are  often  used  to  estimate  the 
target  velocity  alone.  In  this  section  we  explore  the  relationship 
between  the  transmitter  and  sensor  positions  and  the  velocity 
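estimation error lower bound defined by $\mathcal{I}_\Lambda(v)$. To this end,
consider the sub-matrix

\[ \mathcal{I}_\Lambda(v) = \sum_{\lambda \in \Lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \begin{bmatrix} \cos^2\left(\frac{\phi_{\lambda_1} + \theta_{\lambda_2}}{2}\right) & \frac{1}{2}\sin(\phi_{\lambda_1} + \theta_{\lambda_2}) \\ \frac{1}{2}\sin(\phi_{\lambda_1} + \theta_{\lambda_2}) & \sin^2\left(\frac{\phi_{\lambda_1} + \theta_{\lambda_2}}{2}\right) \end{bmatrix} \tag{13} \]

where $\rho_\lambda := \cos^2\left(\frac{\phi_{\lambda_1} - \theta_{\lambda_2}}{2}\right)$
and $f_c/c = 1$. This simplifies to

\[ \mathcal{I}(v) = \frac{1}{\sigma^2} \sum_{i=1}^{n} \begin{bmatrix} \cos^2(\phi_i) & \frac{1}{2}\sin(2\phi_i) \\ \frac{1}{2}\sin(2\phi_i) & \sin^2(\phi_i) \end{bmatrix} \tag{14} \]

in the mono-static or passive scenario with $\sigma_i = \sigma_j = \sigma$ for
all i, j and $f_c/c = 1$.

Theorem 1. The determinant of the Fisher information matrix
$\mathcal{I}_\Lambda(v)$ in (13) is given by

\[ \det(\mathcal{I}_\Lambda(v)) = \frac{1}{4}\left[ \left(\sum_{\lambda \in \Lambda} \frac{\rho_\lambda}{\sigma_\lambda^2}\right)^2 - \left(\sum_{\lambda \in \Lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \cos(\phi_{\lambda_1} + \theta_{\lambda_2})\right)^2 - \left(\sum_{\lambda \in \Lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \sin(\phi_{\lambda_1} + \theta_{\lambda_2})\right)^2 \right] \tag{15} \]

where $\rho_\lambda := \cos^2\left(\frac{\phi_{\lambda_1} - \theta_{\lambda_2}}{2}\right)$.

Proof: Firstly we write

\begin{align}
\mathcal{I}_\Lambda(v) &=: \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{16} \\
a &= \frac{1}{2}\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \left(1 + \cos(\phi_{\lambda_1} + \theta_{\lambda_2})\right) \tag{17} \\
b &= \frac{1}{2}\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \sin(\phi_{\lambda_1} + \theta_{\lambda_2}) \tag{18} \\
c &= \frac{1}{2}\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \sin(\phi_{\lambda_1} + \theta_{\lambda_2}) \tag{19} \\
d &= \frac{1}{2}\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \left(1 - \cos(\phi_{\lambda_1} + \theta_{\lambda_2})\right) \tag{20}
\end{align}

Applying the determinant formula
$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$ we
obtain two terms; the first term “ad” may be reduced by
completing the square:

\[ \text{First Term} = \frac{1}{4}\left[ \left(\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2}\right)^2 - \left(\sum_{\lambda} \frac{\rho_\lambda}{\sigma_\lambda^2} \cos(\phi_{\lambda_1} + \theta_{\lambda_2})\right)^2 \right] \tag{21} \]

Observe that the above is the first two terms in (15). The third
term in (15) is immediate from the second term “bc” in the
determinant formula. ■

In the mono-static, or passive scenario, with $\sigma_i = \sigma_j = \sigma$
the following corollary is useful.

Corollary 1. Let $\phi_i$, $\forall i \in \{1, \dots, n\}$ denote the angular
positions of the sensors. The following are equivalent expressions
for the Fisher information determinant $\det(\mathcal{I}(v))$:

\begin{align}
\text{(i).} \quad \det(\mathcal{I}(v)) &= \frac{1}{4\sigma^4}\left[ n^2 - \left(\sum_{i=1}^{n} \cos(2\phi_i)\right)^2 - \left(\sum_{i=1}^{n} \sin(2\phi_i)\right)^2 \right] \tag{22} \\
\text{(ii).} \quad \det(\mathcal{I}(v)) &= \frac{1}{\sigma^4} \sum_{S} \sin^2(\phi_j - \phi_i), \quad j > i \tag{23}
\end{align}

where $S = \{\{i, j\}\}$ is defined as the set of all combinations
of i and j with $i, j \in \{1, \dots, n\}$ and $j > i$.

Proof: Part (ii) follows from [12]. ■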
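
The equivalence claimed in Corollary 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the velocity information matrix to be (1/sigma^2) times the sum of u_i u_i^T with u_i = (cos phi_i, sin phi_i), which corresponds to the normalization f_c/c = 1 with equal noise levels, and compares the determinant computed directly, via the double-angle sums, and via the pairwise sine form. All function names are hypothetical.

```python
import math
from itertools import combinations

def fisher_velocity(angles, sigma=1.0):
    """Entries (a, b, d) of (1/sigma^2) * sum_i u_i u_i^T, u_i = (cos, sin)."""
    a = sum(math.cos(t) ** 2 for t in angles) / sigma**2
    b = sum(math.cos(t) * math.sin(t) for t in angles) / sigma**2
    d = sum(math.sin(t) ** 2 for t in angles) / sigma**2
    return a, b, d

def det_direct(angles, sigma=1.0):
    """Determinant of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = fisher_velocity(angles, sigma)
    return a * d - b * b

def det_double_angle(angles, sigma=1.0):
    """(n^2 - (sum cos 2phi)^2 - (sum sin 2phi)^2) / (4 sigma^4)."""
    n = len(angles)
    c = sum(math.cos(2 * t) for t in angles)
    s = sum(math.sin(2 * t) for t in angles)
    return (n * n - c * c - s * s) / (4 * sigma**4)

def det_pairwise(angles, sigma=1.0):
    """Sum over unordered pairs of sin^2(phi_j - phi_i), divided by sigma^4."""
    return sum(math.sin(tj - ti) ** 2
               for ti, tj in combinations(angles, 2)) / sigma**4

angles = [0.3, 1.1, 2.5, 4.0]   # arbitrary test angles in radians
print(det_direct(angles, 0.7), det_double_angle(angles, 0.7),
      det_pairwise(angles, 0.7))
```

The three printed values agree up to rounding, because the expressions are algebraically identical for this matrix.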

In both the multi-static and mono-static scenarios, angular
sensor positions which maximize the determinant (15) or (22)
respectively generate an optimal sensor-target geometry for
velocity estimation using Doppler.

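Theorem 2. For the multi-static sensor configuration with
$\sigma_\lambda^2 = \sigma^2$ the Fisher information determinant (15) is upper
bounded by $\frac{n^2}{4\sigma^4}$. This bound is achieved if and only if the
angular positions of the sensors and transmitters are such that

\[ \forall \lambda \in \Lambda : \ \phi_{\lambda_1} = \theta_{\lambda_2} \bmod 2\pi \tag{24} \]

and

\[ \sum_{\lambda \in \Lambda} \cos(2\phi_\lambda) = 0 \quad \text{and} \quad \sum_{\lambda \in \Lambda} \sin(2\phi_\lambda) = 0 \tag{25} \]

holds.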

Theorem 3. For the passive sensor configuration with
$\sigma_\lambda^2 = \sigma^2$, the Fisher information determinant (22) is upper
bounded by $\frac{n^2}{4\sigma^4}$. This upper bound is achieved if and only
if the angular positions of the sensors are such that the equations

\[ \sum_{\lambda \in \Lambda} \cos(2\phi_\lambda) = 0 \quad \text{and} \quad \sum_{\lambda \in \Lambda} \sin(2\phi_\lambda) = 0 \tag{26} \]

hold.

The  remainder  of  this  section  focuses  on  deriving  and 
understanding  certain  physical  consequences  of  Theorem  2. 
Of  course  these  results  apply  also  to  Theorem  3. 

Proposition 5. The following actions do not affect the value
of the Fisher information determinant: (i) changes in the true
individual sensor-target ranges, i.e. moving a sensor from $s_i$ to
$p + k(s_i - p)$ for some $k > 0$; and (ii) reflecting a sensor about
the emitter position, i.e. moving a sensor from $s_i$ to $2p - s_i$.

Part (ii) of the preceding proposition is illustrated in Figure
2. In particular, Figure 2 illustrates two scenarios obtained
from each other by reflecting a particular sensor about the
emitter position.
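
Both invariances in Proposition 5 are easy to confirm numerically. The following sketch assumes the normalized passive or mono-static information matrix, (1/sigma^2) times the sum of u_i u_i^T with u_i the unit vector from sensor i to the target; the positions and helper names are made up for illustration.

```python
import math

def det_info(p, sensors, sigma=1.0):
    """det of (1/sigma^2) * sum_i u_i u_i^T, u_i the unit vector from s_i to p."""
    a = b = d = 0.0
    for s in sensors:
        dx, dy = p[0] - s[0], p[1] - s[1]
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r
        a += ux * ux
        b += ux * uy
        d += uy * uy
    return (a * d - b * b) / sigma**4

def rescale(p, s, k):
    """Move sensor s to p + k (s - p), k > 0: same bearing, different range."""
    return (p[0] + k * (s[0] - p[0]), p[1] + k * (s[1] - p[1]))

def reflect(p, s):
    """Reflect sensor s about the emitter position p, i.e. s -> 2p - s."""
    return (2 * p[0] - s[0], 2 * p[1] - s[1])

p = (1.0, 2.0)
sensors = [(4.0, 2.5), (0.0, 6.0), (-3.0, -1.0)]
base = det_info(p, sensors)
moved = [rescale(p, sensors[0], 2.5), reflect(p, sensors[1]), sensors[2]]
print(abs(det_info(p, moved) - base) < 1e-12)
```

Rescaling leaves each unit vector unchanged and reflection only flips its sign, so every outer product u_i u_i^T, and hence the determinant, is preserved.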


Fig.  2.  This  figure  illustrates  two  scenarios  obtained  from  each  other  by 
reflecting  a  particular  sensor  about  the  emitter  position.  This  reflection  does 
not  affect  the  optimality  of  the  sensor-target  configuration. 


Proposition 6. Let the angle subtended at the target by two
sensors i and j be denoted by $\vartheta_{ij} = \vartheta_{ji}$. One set of solutions
to (25) is characterized by

\[ \vartheta_{ij} = \vartheta_{ji} = \frac{2}{n}\pi \tag{27} \]

for all adjacent sensor pairs $i, j \in \{1, \dots, n \ge 4\}$ with $|j - i| = 1$
or $|j - i| = n - 1$, and then by a possible application of
Proposition 5 on (27).

Proof:  The  proof  of  this  proposition  is  straightforward 
and  involves  verifying  that  (27)  satisfies  (25)  of  Theorem  2. 
Then,  sensor  reflections  as  detailed  in  Proposition  5  do  not 
change  the  value  of  the  determinant.  Further  details  are  omitted 
for  brevity.  ■ 

Corollary 2. Consider the angles $\vartheta_{ij} = \vartheta_{ji}$ subtended at the
target by two adjacent sensors where adjacency implies $|j - i| = 1$
or $|j - i| = n - 1$ with $i, j \in \{1, \dots, n \ge 4\}$. Then
$\vartheta_{ij} = \frac{2}{n}\pi$ and $\vartheta_{ij} = \frac{1}{n}\pi$ are two separate solutions to (25).

Consider now $n \ge 4$ sensors and denote the set of sensors
by V. Now assume that V can be partitioned into some
arbitrary number m of subsets $B_i$ such that $B_i \cap B_j = \emptyset$ and
$|B_i| \ge 2$, $\forall i, j \in \{1, \dots, m\}$ with $i \ne j$. Then the following
corollary follows directly from Theorem 2.

Corollary 3. Given an arbitrary number $n \ge 4$ of sensors,
then a solution to (25) can be obtained by arranging all subsets
of sensors $B_i$, $\forall i \in \{1, \dots, m\}$ such that the condition of
Theorem 2 is satisfied independently for those sensors in $B_i$,
$\forall i \in \{1, \dots, m\}$.

When $2 \le n < 4$ the optimal sensor-target angular
configuration for Doppler-based velocity estimation is unique
up to sensor reflections about the target. When $n \ge 4$ there
exists an infinite number of optimal configurations obtained
by optimally placing disjoint subsets of sensors with
cardinality $\ge 2$. Corollary 3 is useful in forming optimal sensor
configurations with an arbitrary number $n \ge 4$ of sensors. We
demonstrate the flexibility this affords a designer in Figure 3.


Fig.  3.  Consider  a  mono-static  or  passive  scenario  involving  n  =  6  sensors. 
We can partition the sensors into two disjoint sets $B_1 = \{1, 2, 3\}$ and
$B_2 = \{4, 5, 6\}$. This figure illustrates a number of scenarios where each disjoint
subset  of  sensors  is  independently  placed  in  an  optimal  configuration.  Each 
scenario is optimal.

Figure 3 illustrates how disjoint subsets of sensors can be
optimally  and  independently  placed  with  respect  to  each  other 
while  still  maintaining  a  globally  optimal  geometry  in  the 
sense  that  the  angular  positions  obey  (25)  in  Theorem  2. 
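
The balance conditions in (25) can be tested directly for candidate geometries. The sketch below, with illustrative helper names, evaluates the two sums for equally spaced sensors with adjacent angles 2*pi/n and pi/n, and for two three-sensor subsets that are each balanced on their own but rotated arbitrarily relative to one another; the 0.4 rad rotation is an arbitrary choice.

```python
import math

def balance(angles):
    """Return (sum cos 2*phi, sum sin 2*phi); both vanish when (25) holds."""
    return (sum(math.cos(2 * t) for t in angles),
            sum(math.sin(2 * t) for t in angles))

def equally_spaced(n, spacing, offset=0.0):
    """n angular positions with a fixed angle between adjacent sensors."""
    return [offset + k * spacing for k in range(n)]

n = 6
print(balance(equally_spaced(n, 2 * math.pi / n)))   # adjacent angle 2*pi/n
print(balance(equally_spaced(n, math.pi / n)))       # adjacent angle pi/n

# Two subsets of three sensors, each balanced independently, with an
# arbitrary relative rotation between the subsets.
b1 = equally_spaced(3, 2 * math.pi / 3)
b2 = equally_spaced(3, 2 * math.pi / 3, offset=0.4)
print(balance(b1 + b2))
```

Each printed pair is zero up to rounding. Because the two sums in (25) are additive over sensors, any union of independently balanced subsets is again balanced, which is the mechanism behind Corollary 3.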

V.  Conclusion 

In this paper we have outlined the problem of multi-static
Doppler-based target position and velocity estimation.
The  Fisher  information  matrix  was  derived  given  a  separate 
target  illuminator  and  then  given  a  target-based  isotropic 
signal  emission.  Some  remarks  concerning  the  Cramer-Rao 
inequality  and  its  relationship  to  the  estimation  problem  were 
given.  For  the  case  of  Doppler-based  target  velocity  estimation 
we  completely  characterized  the  sensor-target  geometry  and 
provided  a  number  of  conditions  on  the  optimal  placement  of 
the sensors and the transmitters.

Contractions for Consensus Processes*

Abstract: Many distributed control algorithms of current
interest can be modeled by linear recursion equations of the
form $x(t + 1) = M(t)x(t)$, $t \ge 1$, where each M(t) is a
real-valued “stochastic” or “doubly stochastic” matrix. Convergence
of such recursions often reduces to deciding when the sequence
of matrix products M(1), M(2)M(1), M(3)M(2)M(1), ... converges.
Certain types of stochastic and doubly stochastic matrices
have the property that any sequence of products of such matrices
of the form $S_1, S_2S_1, S_3S_2S_1, \dots$ converges exponentially
fast. We explicitly characterize the largest classes of stochastic
and doubly stochastic matrices with positive diagonal entries
which have these properties. The main goal of this paper is
to find a “semi-norm” with respect to which matrices from
these “convergability classes” are contractions. For any doubly
stochastic matrix S such a semi-norm is identified and is shown
to coincide with the second largest singular value of S.

I.  Introduction 

Many  distributed  control  algorithms  of  current  interest  can 
be  modeled  by  linear  recursion  equations  of  the  form 

\[ x(t + 1) = M(t)x(t), \quad t \ge 1 \tag{1} \]

where each M(t) is a real-valued “stochastic” or “doubly
stochastic” matrix. Among these are consensus and flocking
algorithms [2]-[8], distributed averaging algorithms [9]-[11],
and certain types of gossiping algorithms [12]-[14]. Recursion
equations like this have their roots in the literature on
nonhomogeneous  Markov  chains  [15].  While  much  is  known 
at  this  point  about  conditions  on  the  M(t)  for  solutions 
to  converge  to  a  limit  point,  considerably  less  is  known 
about  the  rates  at  which  such  solutions  converge.  There  are 
classical  concepts  such  as  the  coefficient  of  ergodicity  [15] 
which are helpful in deriving convergence rates, but these are
limited to only certain types of processes. The convergence
rate  question  has  been  studied  recently  in  [1],  [11],  [16], 
[17].  In  [9],  [11]  convergence  rate  results  are  derived  for 
distributed  averaging  algorithms.  In  [12]  the  question  is 
addressed  for  probabilistic  gossiping  algorithms.  A  modified 
gossiping  algorithm  intended  to  speed  up  convergence  is 
proposed  in  [18]  without  proof  of  correctness,  but  with 
convincing  experimental  results.  The  algorithm  has  recently 
been  analyzed  in  [19].  Recent  results  concerning  convergence 
rates appear in [13], [20]-[22] for periodic gossiping and in
[1],  [11],  [23]  for  deterministic  aperiodic  gossiping. 

Certain  types  of  stochastic  and  doubly  stochastic  matrices 
have  the  property  that  any  sequence  of  products  of  such 
matrices of the form $S_1, S_2S_1, S_3S_2S_1, \dots$ converges exponentially
fast. In Section II we explicitly characterize the
largest  classes  of  stochastic  and  doubly  stochastic  matrices 
with  positive  diagonal  entries  which  have  these  properties. 
We  call  these  classes  “convergable”.  The  main  goal  of  this 
paper  is  to  find  a  “semi-norm”  with  respect  to  which  matrices 
from  these  convergability  classes  are  contractions.  The  role 
played  by  semi-norms  in  characterizing  convergence  rates 
is explained in Section III. Three different types of semi-norms
are considered. Each is compared to the well known
coefficient  of  ergodicity  which  plays  a  central  role  in  the 
study  of  convergence  rates  for  nonhomogeneous  Markov 
chains  [15].  Somewhat  surprisingly,  for  doubly  stochastic 
matrices it turns out that a particular Euclidean semi-norm on
$\mathbb{R}^{n \times n}$ has the required property, namely that in this
semi-norm, any doubly stochastic matrix S in the convergability
class  of  all  doubly  stochastic  matrices  is  a  contraction.  This 
particular  semi-norm  turns  out  to  be  the  second  largest 
singular  value  of  S. 

A.  Stochastic  and  Doubly  Stochastic  Matrices 

The  type  of  matrices  typically  encountered  in  a  consensus 
process  [4]  modeled  by  (1)  have  only  nonnegative  entries  and 
row  sums  all  equal  one.  Matrices  with  these  properties  are 
called  stochastic.  Doubly  stochastic  matrices  are  stochastic 
matrices  with  the  additional  property  that  their  column 
sums  are  also  all  equal  to  one.  Doubly  stochastic  matrices 
are  typically  encountered  when  (1)  represents  a  distributed 
averaging  [9]  or  gossiping  [12]  process.  It  is  easy  to  see 
that a nonnegative matrix S is stochastic if and only if
$S\mathbf{1} = \mathbf{1}$ where $\mathbf{1} \in \mathbb{R}^n$ is a column vector whose entries
are all ones. Similarly a nonnegative matrix S is doubly
stochastic if and only if $S\mathbf{1} = \mathbf{1}$ and $S'\mathbf{1} = \mathbf{1}$. Using
these characterizations it is easy to prove that the class of
stochastic matrices in $\mathbb{R}^{n \times n}$ is compact and closed under
multiplication as is the class of doubly stochastic matrices
in $\mathbb{R}^{n \times n}$. It is also true that the class of nonnegative matrices
in $\mathbb{R}^{n \times n}$ with positive diagonal entries is closed under
multiplication. Stochastic and doubly stochastic matrices
with  positive  diagonal  entries  are  commonly  encountered  in 
the  study  of  consensus  processes;  positive  diagonal  entries 
greatly  simplify  convergence  analysis. 
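
The characterizations S1 = 1 and S'1 = 1 translate directly into simple numerical checks. The following sketch, with made-up example matrices and helper names, tests both properties and the closure of each class under multiplication.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_stochastic(m, tol=1e-12):
    """Nonnegative entries and every row sum equal to one (S1 = 1)."""
    nonnegative = all(x >= 0 for row in m for x in row)
    return nonnegative and all(abs(sum(row) - 1.0) <= tol for row in m)

def is_doubly_stochastic(m, tol=1e-12):
    """Stochastic with column sums also equal to one (S'1 = 1)."""
    transpose = [list(col) for col in zip(*m)]
    return is_stochastic(m, tol) and is_stochastic(transpose, tol)

def has_positive_diagonal(m):
    return all(m[i][i] > 0 for i in range(len(m)))

A = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.1, 0.9]]          # stochastic, not doubly stochastic
B = [[0.5, 0.5, 0.0],
     [0.5, 0.25, 0.25],
     [0.0, 0.25, 0.75]]        # symmetric, hence doubly stochastic

print(is_stochastic(matmul(A, B)), has_positive_diagonal(matmul(A, B)))
print(is_doubly_stochastic(matmul(B, B)))
```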

Mathematically, reaching a consensus means that the state
vector x(t) appearing in (1) converges to a limit vector of the
form $a\mathbf{1}$ where a is a number depending on the initial value
of x. This will always be the case if the infinite sequence
of matrix products M(1), M(2)M(1), M(3)M(2)M(1), ...
converges to a matrix of the form $\mathbf{1}c$, in which case
$a = cx(1)$. It should be clear from what has just been stated
that if the M(t) are all doubly stochastic, then $c = \frac{1}{n}\mathbf{1}'$,
which means that in this case a is the average of the
values of the entries in x(1). Thus to study convergence
of  the  consensus  process  modeled  by  (1),  it  suffices  to 
study  the  convergence  of  infinite  sequences  of  products  of 
stochastic  and  doubly  stochastic  matrices.  Such  sequences 
are  closely  related  to  what  are  called  “nonhomogeneous 
Markov  chains”  for  which  there  is  a  substantial  literature 
[15].  Notwithstanding  this,  the  following  question  remains. 
What  determines  the  convergence  rate  of  such  sequences? 
This  is  the  question  which  will  be  considered  in  the  sequel. 
We  begin  with  a  few  basic  ideas. 
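
As a concrete illustration of recursion (1), the sketch below iterates x(t + 1) = M x(t) with a fixed matrix. The two example matrices and the initial state are arbitrary choices made for this illustration: the doubly stochastic one drives every entry to the average of x(1), while the merely stochastic one still reaches agreement but on a different weighted value.

```python
def step(m, x):
    """One update x(t + 1) = M x(t)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]

def run_consensus(m, x1, steps):
    """Apply the same matrix M for the given number of steps."""
    x = list(x1)
    for _ in range(steps):
        x = step(m, x)
    return x

doubly = [[0.5, 0.5, 0.0],
          [0.5, 0.25, 0.25],
          [0.0, 0.25, 0.75]]
stochastic = [[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.1, 0.9]]

x1 = [1.0, 5.0, 9.0]               # average of the entries is 5
print(run_consensus(doubly, x1, 200))
print(run_consensus(stochastic, x1, 500))
```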

B. Graph of a Stochastic Matrix

Many  properties  of  a  stochastic  matrix  can  be  usefully 
described  in  terms  of  an  associated  directed  graph  determined 
by the matrix. The graph of a nonnegative matrix $M \in \mathbb{R}^{n \times n}$,
written $\gamma(M)$, is a directed graph on n vertices with an arc
from vertex i to vertex j just in case $m_{ji} \ne 0$; if (i, j) is
such an arc, we say that i is a neighbor of j and that j is
an observer of i. Thus $\gamma(M)$ is that directed graph whose
adjacency matrix is the transpose of the matrix obtained by
replacing all nonzero entries in M with ones.

C.  Connectivity 

There  are  various  notions  of  connectivity  which  are  useful 
in  the  study  of  the  convergence  of  products  of  stochastic 
matrices.  Perhaps  the  most  familiar  of  these  is  the  idea  of 
“strong  connectivity”.  A  directed  graph  is  strongly  connected 
if  there  is  a  directed  path  between  each  pair  of  distinct 
vertices.  A  directed  graph  is  weakly  connected  if  there  is  an 
undirected  path  between  each  pair  of  distinct  vertices.  There 
are  other  notions  of  connectivity  which  are  also  useful  in 
this  context.  To  define  several  of  them,  let  us  agree  to  call  a 
vertex  i  of  a  directed  graph  G,  a  root  of  G  if  for  each  other 
vertex  j  of  G,  there  is  a  directed  path  from  i  to  j.  Thus  i 
is  a  root  of  G  if  it  is  the  root  of  a  directed  spanning  tree  of 
G.  We  will  say  that  G  is  rooted  at  i  if  i  is  in  fact  a  root. 
Thus  G  is  rooted  at  i  just  in  case  each  other  vertex  of  G 
is  reachable  from  vertex  i  along  a  directed  path  within  the 
graph.  G  is  strongly  rooted  at  i  if  each  other  vertex  of  G 
is  reachable  from  vertex  i  along  a  directed  path  of  length  1. 
Thus  G  is  strongly  rooted  at  i  if  i  is  a  neighbor  of  every  other 
vertex  in  the  graph.  By  a  rooted  graph  is  meant  a  directed 


graph  which  possesses  at  least  one  root.  A  strongly  rooted 
graph  is  a  graph  which  has  at  least  one  vertex  at  which  it  is 
strongly rooted. Note that a nonnegative matrix $M \in \mathbb{R}^{n \times n}$
has  a  strongly  rooted  graph  if  and  only  if  it  has  a  positive 
column.  Note  that  every  strongly  connected  graph  is  rooted 
and  every  rooted  graph  is  weakly  connected.  The  converse 
statements  are  false.  In  particular  there  are  weakly  connected 
graphs  which  are  not  rooted  and  rooted  graphs  which  are  not 
strongly  connected. 
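
The graph of a matrix and the connectivity notions defined above can be computed with a few lines of code. This is a minimal sketch with 0-based vertex labels and hypothetical function names; the arc convention follows the text, with an arc from i to j whenever entry (j, i) of M is nonzero.

```python
from collections import deque

def graph_arcs(m):
    """Arc set of the graph of M: (i, j) is an arc when m[j][i] != 0."""
    n = len(m)
    return {(i, j) for i in range(n) for j in range(n) if m[j][i] != 0}

def reachable_from(i, arcs, n):
    """Vertices reachable from i along directed paths (including i itself)."""
    seen = {i}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if (u, v) in arcs and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def roots(arcs, n):
    """Vertices from which every other vertex is reachable."""
    return [i for i in range(n) if len(reachable_from(i, arcs, n)) == n]

def strong_roots(arcs, n):
    """Vertices with an arc (path of length 1) to every other vertex."""
    return [i for i in range(n)
            if all((i, j) in arcs for j in range(n) if j != i)]

def is_strongly_connected(arcs, n):
    """Strongly connected exactly when every vertex is a root."""
    return len(roots(arcs, n)) == n

chain = [[1.0, 0.0, 0.0],
         [0.5, 0.5, 0.0],
         [0.0, 0.5, 0.5]]      # rooted at vertex 0 only
star = [[1.0, 0.0, 0.0],
        [0.5, 0.5, 0.0],
        [0.5, 0.0, 0.5]]       # first column positive: strongly rooted at 0

for m in (chain, star):
    arcs = graph_arcs(m)
    print(roots(arcs, 3), strong_roots(arcs, 3), is_strongly_connected(arcs, 3))
```

The `chain` example is rooted but neither strongly rooted nor strongly connected, which matches the remark that the converse implications fail.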

D.  Composition 

Since  we  will  be  interested  in  products  of  stochastic 
matrices,  we  will  be  interested  in  graphs  of  such  products  and 
how  they  are  related  to  the  graphs  of  the  matrices  comprising 
the  products.  For  this  we  need  the  idea  of  “composition”  of 
graphs. Let $G_p$ and $G_q$ be two directed graphs with vertex
set V. By the composition of $G_p$ with $G_q$, written $G_q \circ G_p$, is
meant the directed graph with vertex set V and arc set defined
in such a way that (i, j) is an arc of the composition
just in case there is a vertex k such that (i, k) is an arc
of $G_p$ and (k, j) is an arc of $G_q$. Thus (i, j) is an arc in
$G_q \circ G_p$ if and only if i has an observer in $G_p$ which is
also a neighbor of j in $G_q$. Note that composition is an
associative binary operation; because of this, the definition
extends unambiguously to any finite sequence of directed
graphs $G_1, G_2, \dots, G_k$ with the same vertex set.

Composition and matrix multiplication are closely related.
In particular, the graph of the product of two nonnegative
matrices $M_1, M_2 \in \mathbb{R}^{n \times n}$ is equal to the composition of
the graphs of the two matrices comprising the product. In
other words, $\gamma(M_2 M_1) = \gamma(M_2) \circ \gamma(M_1)$.
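
The relation between composition and matrix multiplication can be verified on a small example. The sketch below uses illustrative names and 0-based vertex labels; the two matrices are arbitrary nonnegative examples with positive diagonals.

```python
def graph_arcs(m):
    """Arc set of the graph of M: (i, j) is an arc when m[j][i] != 0."""
    n = len(m)
    return {(i, j) for i in range(n) for j in range(n) if m[j][i] != 0}

def compose(g_p, g_q):
    """Arc set of G_q o G_p: (i, j) whenever (i, k) in G_p and (k, j) in G_q."""
    return {(i, j) for (i, k) in g_p for (k2, j) in g_q if k == k2}

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M1 = [[1.0, 0.0, 0.0],
      [0.5, 0.5, 0.0],
      [0.0, 0.0, 1.0]]
M2 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.5, 0.5]]

product_graph = graph_arcs(matmul(M2, M1))
composition = compose(graph_arcs(M1), graph_arcs(M2))
union = graph_arcs(M1) | graph_arcs(M2)
print(product_graph == composition)
print(sorted(composition - union))
```

For these matrices the composition contains an arc that belongs to neither factor graph, so the union of the arc sets is strictly smaller than the composition.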

If we focus exclusively on graphs with self-arcs at all
vertices, more can be said. In this case the definition of
composition implies that the arcs of both $G_p$ and $G_q$ are
arcs of $G_q \circ G_p$; the converse is false. The definition of
composition also implies that if $G_p$ has a directed path
from i to k and $G_q$ has a directed path from k to j, then
$G_q \circ G_p$ has a directed path from i to j. These implications
are consequences of the requirement that the graphs in
question have self-arcs at all vertices. It is worth
emphasizing that the union of the arc sets of a sequence of
graphs $G_1, G_2, \dots, G_k$ with self-arcs must be contained in
the arc set of their composition. However the converse is
not true in general and it is for this reason that composition
rather than union proves to be the more useful concept for
our purposes.

II.  CONVERGABILITY 

It  is  of  obvious  interest  to  have  a  clear  understanding  of 
what  kinds  of  stochastic  matrices  within  an  infinite  product 
guarantee  that  the  infinite  product  converges.  There  are  many 
ways  to  address  this  issue  and  many  existing  results.  Here 
we  focus  on  just  one  issue. 

Let $\mathcal{S}$ denote the set of all stochastic matrices in $\mathbb{R}^{n \times n}$
with positive diagonal entries. Call a compact subset
$\mathcal{M} \subset \mathcal{S}$ convergable if for each infinite sequence of
matrices $M_1, M_2, M_3, \dots$ from $\mathcal{M}$, the sequence of products
$M_1, M_2M_1, M_3M_2M_1, \dots$ converges exponentially fast to
a matrix of the form $\mathbf{1}c$. Convergability can be characterized
as follows.

Theorem 1: Let $\mathcal{R}$ denote the set of all matrices in $\mathcal{S}$ with
rooted graphs. Then a compact subset $\mathcal{M} \subset \mathcal{S}$ is convergable
if and only if $\mathcal{M} \subset \mathcal{R}$.

The theorem implies that $\mathcal{R}$ is the largest subset of
n x n stochastic matrices with positive diagonal entries
whose compact subsets are all convergable. $\mathcal{R}$ itself is not
convergable because it is not closed and thus not compact.

Proof of Theorem 1: The fact that any compact subset of
$\mathcal{R}$ is convergable is more or less well known from the work
reported in [24]; the statement also follows from Proposition
11 of [25]. To prove the converse, suppose that $\mathcal{M} \subset \mathcal{S}$
is convergable. Then by continuity, every sufficiently long
product of matrices from $\mathcal{M}$ must be a matrix with a
positive column. Therefore, the graph of every sufficiently
long product of matrices from $\mathcal{M}$ must be strongly rooted.
It follows from Proposition 5 of [25] that $\mathcal{M}$ must be a subset
of $\mathcal{R}$. ■

Although doubly stochastic matrices are stochastic,
convergability for classes of doubly stochastic matrices has a
different characterization than it does for classes of stochastic
matrices. Let $\mathcal{D}$ denote the set of all doubly stochastic
matrices in $\mathcal{S}$. In the sequel we will prove the following
theorem.

Theorem 2: Let $\mathcal{W}$ denote the set of all matrices in $\mathcal{D}$ with
strongly connected graphs. Then a compact subset $\mathcal{M} \subset \mathcal{D}$
is convergable if and only if $\mathcal{M} \subset \mathcal{W}$.

The theorem implies that $\mathcal{W}$ is the largest subset of n x
n doubly stochastic matrices with positive diagonal entries
whose compact subsets are all convergable. Like $\mathcal{R}$, $\mathcal{W}$ is
not convergable because it is not compact. Results which
more  or  less  imply  the  sufficiency  of  strong  connectivity  can 
be  found  in  [24]  and  elsewhere.  Note  that  sufficiency  is  also 
implied  by  Theorem  1  because  doubly  stochastic  matrices 
with  strongly  connected  graphs  are  stochastic  matrices  with 
rooted  graphs.  It  remains  therefore,  to  prove  the  necessity  of 
Theorem  2.  This  will  be  done  in  the  sequel. 

An  interesting  set  of  stochastic  matrices  in  S  whose 
compact  subsets  are  known  to  be  convergable,  is  the  set  of 
all “scrambling matrices”. A matrix $S \in \mathcal{S}$ is scrambling if
for each distinct pair of integers i and j, there is a column
k of S for which $s_{ik}$ and $s_{jk}$ are both nonzero [15]. In
graph  theoretic  terms  S  is  a  scrambling  matrix  just  in  case 
its  graph  is  “neighbor  shared”  where  by  neighbor  shared  we 
mean  that  each  distinct  pair  of  vertices  in  the  graph  share  a 
common  neighbor  [25].  Convergability  of  compact  subsets 
of  scrambling  matrices  is  tied  up  with  the  concept  of  the 
coefficient  of  ergodicity  [15]  which  for  a  given  stochastic 
matrix  S  £  S  is  defined  by  the  formula 

1  ” 

t(S)  =  -  max^  |s»fe  -  sjk  |  (2) 

2  fc= l 

It is known that $0 \le \tau(S) \le 1$ for all $S \in \mathcal{S}$ and that

$$\tau(S) < 1 \qquad (3)$$

if and only if $S$ is a scrambling matrix. It is also known that

$$\tau(S_2 S_1) \le \tau(S_2)\tau(S_1), \quad S_1, S_2 \in \mathcal{S} \qquad (4)$$

It can be shown that (3) and (4) are sufficient conditions
to ensure that any compact subset of scrambling matrices is
convergable. But $\tau(\cdot)$ has another role. It provides a worst
case convergence rate for any infinite product of scrambling
matrices from a given compact set $\mathcal{C} \subset \mathcal{S}$.
In particular, it can be easily shown that as $i \to \infty$, any
product $S_i \cdots S_2 S_1$ of scrambling matrices
$S_i \in \mathcal{C}$ converges to a matrix of the form $\mathbf{1}c$
as fast as $\lambda^i$ converges to zero, where

$$\lambda = \max_{S \in \mathcal{C}} \tau(S)$$
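The definition in (2) and the scrambling test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: matrices are plain lists of rows, and the two example matrices are hypothetical and do not come from the paper.

```python
from typing import List

Matrix = List[List[float]]


def coefficient_of_ergodicity(S: Matrix) -> float:
    """tau(S) = (1/2) * max over row pairs (i, j) of sum_k |s_ik - s_jk|, as in (2)."""
    n = len(S)
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dist = sum(abs(a - b) for a, b in zip(S[i], S[j]))
            worst = max(worst, dist)
    return 0.5 * worst


def is_scrambling(S: Matrix) -> bool:
    """True if every pair of distinct rows has a column where both entries are nonzero."""
    n = len(S)
    return all(
        any(S[i][k] != 0 and S[j][k] != 0 for k in range(len(S[i])))
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical stochastic matrices with positive diagonals (illustration only).
S_scrambling = [[0.5, 0.5, 0.0],
                [0.0, 0.5, 0.5],
                [0.0, 0.5, 0.5]]
S_not_scrambling = [[0.5, 0.5, 0.0, 0.0],
                    [0.5, 0.5, 0.0, 0.0],
                    [0.0, 0.0, 0.5, 0.5],
                    [0.0, 0.0, 0.5, 0.5]]
```

For the first matrix the coefficient is strictly less than one and the scrambling test succeeds; for the second, rows 1 and 3 share no column and the coefficient equals one, in line with (3).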

The preceding discussion suggests the following question.
Can analogs of the coefficient of ergodicity satisfying formulas
like (3) and (4) be found for the set of stochastic matrices
with  rooted  graphs  or  perhaps  for  the  set  of  doubly  stochastic 
matrices  with  strongly  connected  graphs?  In  the  sequel  we 
will  provide  a  partial  answer  to  this  question  for  the  case  of 
stochastic  matrices  and  a  complete  answer  for  the  case  of 
doubly  stochastic  matrices.  Our  approach  will  be  to  appeal 
to  certain  types  of  semi-norms  of  stochastic  matrices. 


III.  Semi-norms 

Let $\|\cdot\|_p$ be the induced $p$-norm on $\mathbb{R}^{m \times n}$.
In this paper we will be primarily interested in the cases
$p = 1, 2, \infty$. Note that for a nonnegative matrix $A$

$$\|A\|_1 = \text{max column sum of } A$$

$$\|A\|_2 = \sqrt{\mu(A'A)}$$

$$\|A\|_\infty = \text{max row sum of } A$$

where $\mu(A'A)$ is the largest eigenvalue of $A'A$; that is, the
square of the largest singular value of $A$. For any integer
$p > 0$ and matrix $M \in \mathbb{R}^{m \times n}$ define

$$|M|_p = \min_{c \in \mathbb{R}^{1 \times n}} \|M - \mathbf{1}c\|_p$$

As defined, $|\cdot|_p$ is nonnegative and $|M|_p \le \|M\|_p$; clearly
$|\mu M|_p = |\mu|\,|M|_p$ for all real numbers $\mu$, so $|\cdot|_p$ is
“positively homogeneous” [26]. Let $M_1$ and $M_2$ be matrices in
$\mathbb{R}^{m \times n}$ and let $c_0$, $c_1$, and $c_2$ denote values
of $c$ which minimize $\|M_1 + M_2 - \mathbf{1}c\|_p$,
$\|M_1 - \mathbf{1}c\|_p$, and $\|M_2 - \mathbf{1}c\|_p$ respectively.
Note that

$$\begin{aligned}
|M_1 + M_2|_p &= \|M_1 + M_2 - \mathbf{1}c_0\|_p \\
&\le \|M_1 + M_2 - \mathbf{1}(c_1 + c_2)\|_p \\
&\le \|M_1 - \mathbf{1}c_1\|_p + \|M_2 - \mathbf{1}c_2\|_p \\
&= |M_1|_p + |M_2|_p
\end{aligned}$$

Thus the triangle inequality holds. These properties mean
that $|\cdot|_p$ is a semi-norm. $|\cdot|_p$ behaves much like a norm.
For example, if $N$ is a submatrix of $M$, then $|N|_p \le |M|_p$.
However $|\cdot|_p$ is not a norm because $|M|_p = 0$ does not
imply $M = 0$; rather it implies that $M = \mathbf{1}c$ for some row
vector $c$ which minimizes $\|M - \mathbf{1}c\|_p$. For our purposes,
$|\cdot|_p$ has a particularly important property:
Lemma 1: Suppose $\mathcal{M}$ is a subset of $\mathbb{R}^{n \times n}$
such that $M\mathbf{1} = \mathbf{1}$ for all $M \in \mathcal{M}$. Then

$$|M_2 M_1|_p \le |M_2|_p\,|M_1|_p, \quad M_1, M_2 \in \mathcal{M} \qquad (5)$$


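Proof of Lemma 1: Let $c_0$, $c_1$, and $c_2$ denote values
of $c$ which minimize $\|M_2 M_1 - \mathbf{1}c\|_p$,
$\|M_1 - \mathbf{1}c\|_p$, and $\|M_2 - \mathbf{1}c\|_p$ respectively.
Then, using $M_2\mathbf{1} = \mathbf{1}$,

$$\begin{aligned}
|M_2 M_1|_p &= \|M_2 M_1 - \mathbf{1}c_0\|_p \\
&\le \|M_2 M_1 - \mathbf{1}(c_2 M_1 + c_1 - c_2 \mathbf{1} c_1)\|_p \\
&= \|M_2 M_1 - \mathbf{1}c_2 M_1 - M_2 \mathbf{1} c_1 + \mathbf{1}c_2 \mathbf{1} c_1\|_p \\
&= \|(M_2 - \mathbf{1}c_2)(M_1 - \mathbf{1}c_1)\|_p \\
&\le \|M_2 - \mathbf{1}c_2\|_p\,\|M_1 - \mathbf{1}c_1\|_p \\
&= |M_2|_p\,|M_1|_p
\end{aligned}$$

Thus (5) is true. ■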

We say that $M \in \mathbb{R}^{n \times n}$ is semi-contractive in the
$p$-norm if $|M|_p < 1$. In view of Lemma 1, the product of
semi-contractive matrices in $\mathcal{M}$ is thus semi-contractive. The
importance of these ideas lies in the following fact.

Proposition 1: Suppose $\mathcal{M}$ is a subset of
$\mathbb{R}^{n \times n}$ such that $M\mathbf{1} = \mathbf{1}$ for all
$M \in \mathcal{M}$. Let $p$ be fixed and let $\bar{\mathcal{M}}$ be a
compact set of semi-contractive matrices in $\mathcal{M}$. Let

$$\lambda = \sup_{M \in \bar{\mathcal{M}}} |M|_p$$

Then for each infinite sequence of matrices $M_i \in \bar{\mathcal{M}}$,
$i \in \{1, 2, \ldots\}$, the matrix product $M_i M_{i-1} \cdots M_1$
converges as $i \to \infty$, as fast as $\lambda^i$ converges to zero,
to a rank one matrix of the form $\mathbf{1}c$.
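The rate claimed in Proposition 1 can be observed numerically. The sketch below is an illustration only: it uses the $p = \infty$ semi-norm, computed with the row-difference formula that the paper states later for nonnegative matrices, and two hypothetical stochastic matrices `A` and `B` that are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product A @ B for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def seminorm_inf(M: Matrix) -> float:
    """|M|_inf via (1/2) max_{i,j} sum_k |m_ik - m_jk| (nonnegative M assumed)."""
    return 0.5 * max(sum(abs(a - b) for a, b in zip(r1, r2))
                     for r1 in M for r2 in M)


def product_seminorms(mats: Sequence[Matrix], steps: int) -> List[Tuple[float, float]]:
    """Return (|M_i ... M_1|_inf, lambda**i) for i = 1..steps, cycling through mats."""
    lam = max(seminorm_inf(M) for M in mats)
    prod = mats[0]
    out = [(seminorm_inf(prod), lam)]
    for i in range(2, steps + 1):
        # New factors multiply on the left, matching M_i M_{i-1} ... M_1.
        prod = matmul(mats[(i - 1) % len(mats)], prod)
        out.append((seminorm_inf(prod), lam ** i))
    return out


# Hypothetical semi-contractive stochastic matrices (illustration only).
A = [[0.6, 0.4, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.4, 0.6]]
B = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
results = product_seminorms([A, B], 20)
```

Each entry of `results` pairs the semi-norm of the running product with the bound $\lambda^i$; the semi-norm tending to zero means the rows of the product become equal, which is the rank one form $\mathbf{1}c$.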


A.  The  case  p  =  1 

We now consider in more detail the case when $p = 1$. For
this case it is possible to derive an explicit formula for the
semi-norm $|M|_1$ of a nonnegative matrix $M \in \mathbb{R}^{n \times n}$.

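Proposition 2: Let $q$ be the unique integer quotient of $n$
divided by 2. Let $M \in \mathbb{R}^{n \times n}$ be a nonnegative
matrix. Then

$$|M|_1 = \max_{j \in \{1, 2, \ldots, n\}} \left( \sum_{i \in \mathcal{L}_j} m_{ij} - \sum_{i \in \mathcal{S}_j} m_{ij} \right)$$

where $\mathcal{L}_j$ and $\mathcal{S}_j$ are respectively the row indices
of the $q$ largest and $q$ smallest entries in the $j$th column of $M$.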
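As a sanity check on Proposition 2, the following sketch compares the closed form with a direct minimization. It relies on two standard facts that are assumptions of this illustration rather than statements from the paper: the induced 1-norm is the largest column sum of absolute values, and a median minimizes a sum of absolute deviations, so each entry of $c$ can be chosen column by column. The example matrices are hypothetical.

```python
from statistics import median
from typing import List

Matrix = List[List[float]]


def seminorm_1_formula(M: Matrix) -> float:
    """|M|_1 from Proposition 2: per column, sum of q largest minus sum of q smallest."""
    n = len(M)
    q = n // 2
    best = 0.0
    for j in range(n):
        col = sorted(M[i][j] for i in range(n))
        best = max(best, sum(col[n - q:]) - sum(col[:q]))
    return best


def seminorm_1_via_median(M: Matrix) -> float:
    """Minimize max_j sum_i |m_ij - c_j| by taking c_j as the median of column j."""
    n = len(M)
    worst = 0.0
    for j in range(n):
        col = [M[i][j] for i in range(n)]
        c_j = median(col)
        worst = max(worst, sum(abs(v - c_j) for v in col))
    return worst


# Hypothetical doubly stochastic matrices (illustration only).
dense = [[0.5, 0.25, 0.25],
         [0.25, 0.5, 0.25],
         [0.25, 0.25, 0.5]]
blocks = [[0.5, 0.5, 0.0, 0.0],
          [0.5, 0.5, 0.0, 0.0],
          [0.0, 0.0, 0.5, 0.5],
          [0.0, 0.0, 0.5, 0.5]]
```

The matrix `blocks` has only two nonzero entries per column with $q = 2$, so its semi-norm equals one, while `dense` has three nonzero entries per column with $q = 1$ and a semi-norm below one.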

Now consider the case when $M$ is a doubly stochastic
matrix $S$. More can be said:

Theorem 3: Let $q$ be the unique integer quotient of $n$
divided by 2. Let $S \in \mathbb{R}^{n \times n}$ be a doubly stochastic
matrix. Then $|S|_1 \le 1$. Moreover $S$ is a semi-contraction in the
one-norm if and only if the number of nonzero entries in each
column of $S$ exceeds $q$.

Note that the doubly stochastic matrix

has a strongly connected graph but is not a semi-contraction
for $p = 1$. Thus this particular semi-norm is not as useful as
we would like because there are matrices in $\mathcal{W}$ which are
not semi-contractions for $p = 1$.

It is possible to compare this semi-norm with the coefficient
of ergodicity. Observe that while the preceding matrix
is not a semi-contraction it is a scrambling matrix. Thus for
this example, $\tau(S) < |S|_1 = 1$. On the other hand there are
also doubly stochastic matrices which are semi-contractions
but which are not scrambling matrices. An example of this
is the matrix

Thus for this example, $|S|_1 < \tau(S) = 1$, which means that
there are situations when it may be more advantageous to
use the semi-norm $|\cdot|_1$ to compute convergence rates than
to appeal to the coefficient of ergodicity.


B. The case p = ∞

Note that in this case $|S|_\infty \le 1$ for any stochastic matrix
because $|S|_\infty \le \|S\|_\infty = 1$. Although not at all obvious,
it turns out that $|S|_\infty$ equals the coefficient of ergodicity
discussed earlier and defined by (2). This is an immediate
consequence of Proposition 3 which is stated below.
Unfortunately, the last example in the preceding subsection
shows that there are doubly stochastic matrices with strongly
connected graphs which are not scrambling matrices. Thus
this particular semi-norm is also not as useful as we might
hope.

Proposition 3: Let $A \in \mathbb{R}^{n \times n}$ be a nonnegative
matrix. Then

$$|A|_\infty = \frac{1}{2} \max_{i,j} \sum_{k=1}^{n} |a_{ik} - a_{jk}|$$


C.  The  case  p  =  2 

For the case when $p = 2$ it is also possible to derive an
explicit formula for the semi-norm $|M|_2$ of a nonnegative
matrix $M \in \mathbb{R}^{n \times n}$. Towards this end note that for any
$x \in \mathbb{R}^n$, the function
$g(x, c) = x'(M - \mathbf{1}c)'(M - \mathbf{1}c)x$ attains
its minimum with respect to $c$ at $\frac{1}{n}\mathbf{1}'M$. This
implies that

$$|M|_2 = \|PM\|_2 = \sqrt{\mu\{M'P'PM\}}$$

where $P = I - \frac{1}{n}\mathbf{1}\mathbf{1}'$ and, for any symmetric
matrix $T$, $\mu\{T\}$ is the largest eigenvalue of $T$. We are led to
the following result.

Proposition 4: Let $M \in \mathbb{R}^{n \times n}$ be a nonnegative
matrix. Then $|M|_2$ is the largest singular value of the matrix $PM$
where $P$ is the orthogonal projection on the orthogonal
complement of the span of $\mathbf{1}$.

Now suppose that $M$ is a doubly stochastic matrix $S$. Then
$S'S$ is also doubly stochastic and $\mathbf{1}'S = \mathbf{1}'$. The
latter and Proposition 4 imply that

$$|S|_2 = \sqrt{\mu\left\{S'S - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right\}} \qquad (6)$$

More  can  be  said: 

Lemma 2: If $S$ is doubly stochastic, then
$\mu\{S'S - \frac{1}{n}\mathbf{1}\mathbf{1}'\}$ is the second largest
eigenvalue of $S'S$.

Proof of Lemma 2: Since $S'S$ is symmetric it has
orthogonal eigenvectors, one of which is $\mathbf{1}$. Let
$\mathbf{1}, x_2, \ldots, x_n$ be such a set of eigenvectors with
eigenvalues $1, \lambda_2, \ldots, \lambda_n$. Then
$S'S\mathbf{1} = \mathbf{1}$ and $S'Sx_i = \lambda_i x_i$,
$i \in \{2, 3, \ldots, n\}$. Clearly
$(S'S - \frac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{1} = 0$ and
$(S'S - \frac{1}{n}\mathbf{1}\mathbf{1}')x_i = \lambda_i x_i$,
$i \in \{2, 3, \ldots, n\}$. Since 1 is the largest eigenvalue of $S'S$
it must therefore be true that the second largest eigenvalue of $S'S$
is the largest eigenvalue of $S'S - \frac{1}{n}\mathbf{1}\mathbf{1}'$. ■

We  summarize: 

Theorem 4: For $p = 2$ the semi-norm of a doubly stochastic
matrix $S$ is the second largest singular value of $S$.

There is another way to think about what this theorem
implies. Prompted by the work in [9] and [11], suppose one
wants to measure, in the sense of a 2-norm $\|\cdot\|$, how much
closer an $n$-vector $x$ gets to the average vector
$z = \frac{1}{n}\mathbf{1}\mathbf{1}'x$ when it is multiplied by a doubly
stochastic matrix $S$. In other words, how does the norm $\|Sx - z\|$
compare with $\|x - z\|$? To address this question, note first that
$x - z \in \mathcal{O}$ where $\mathcal{O}$ is the orthogonal complement
of the span of $\mathbf{1}$. Note next that

$$\|Sx - z\|^2 = \|S(x - z)\|^2 \le \left( \sup_{y \in \mathcal{O}} \frac{y'S'Sy}{y'y} \right) \|x - z\|^2$$

But $\sup_{y \in \mathcal{O}} \frac{y'S'Sy}{y'y}$ is the second largest
eigenvalue of $S'S$, which in turn is the square of the second largest
singular value of $S$. In other words, $\|Sx - z\| \le |S|_2 \|x - z\|$.
Thus $Sx$ is always at least as close to the average vector as $x$ is,
and is even closer if $S$ is a semi-contraction in the 2-norm.
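Theorem 4 and the bound above can be checked on a small example. The sketch below estimates $|S|_2$ by power iteration on $S'S - \frac{1}{n}\mathbf{1}\mathbf{1}'$; power iteration is a choice made for this illustration, not a method from the paper, and it assumes the starting vector has a component along the dominant eigenvector. The matrix `S` and vector `x` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def seminorm_2_doubly_stochastic(S: Matrix, iters: int = 500) -> float:
    """Estimate |S|_2 = sqrt(mu{S'S - (1/n) 11'}) by power iteration (sketch)."""
    n = len(S)
    # B = S'S - (1/n) 11', a symmetric positive semidefinite matrix when S is doubly stochastic.
    B = [[sum(S[k][i] * S[k][j] for k in range(n)) - 1.0 / n for j in range(n)]
         for i in range(n)]
    y = [1.0] + [0.0] * (n - 1)  # start vector, not parallel to 1 when n > 1
    mu = 0.0
    for _ in range(iters):
        w = [sum(B[i][j] * y[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            return 0.0
        y = [v / norm for v in w]
        # Rayleigh quotient of the normalized iterate.
        mu = sum(y[i] * sum(B[i][j] * y[j] for j in range(n)) for i in range(n))
    return math.sqrt(max(mu, 0.0))


def distance_to_average(x: Vector) -> float:
    """Euclidean distance from x to its average vector z = (1/n) 11'x."""
    z = sum(x) / len(x)
    return math.sqrt(sum((v - z) ** 2 for v in x))


def apply(S: Matrix, x: Vector) -> Vector:
    """Matrix-vector product Sx."""
    return [sum(s * v for s, v in zip(row, x)) for row in S]


# Hypothetical doubly stochastic matrix; its singular values are 1, 0.25, 0.25.
S = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
x = [3.0, 0.0, 0.0]
```

For this `S` the estimate is close to 0.25, and the distance from `apply(S, x)` to the average is no more than 0.25 times the distance from `x` to the average.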

In the light of Theorem 4, we are now in a position to
characterize in graph theoretic terms those doubly stochastic
matrices with positive diagonal entries which are
semi-contractions for $p = 2$.

Theorem 5: Let $S$ be a doubly stochastic matrix with
positive diagonal entries. Then $|S|_2 \le 1$. Moreover $S$ is a
semi-contraction in the 2-norm if and only if the graph of $S$
is strongly connected.

To  prove  this  theorem  we  need  several  concepts  and 
results.  Let  G  denote  a  directed  graph  and  write  G'  for  that 
graph  which  results  when  the  arcs  in  G  are  reversed;  i.e., 
the  dual  graph.  Call  a  graph  symmetric  if  it  is  equal  to  its 
dual.  Note  that  in  the  case  of  a  symmetric  graph,  the  three 
properties  of  being  rooted,  strongly  connected,  and  weakly 
connected  are  equivalent.  Note  also  that  if  G  is  the  graph  of 
a  nonnegative  matrix  M  with  positive  diagonal  entries,  then 
G'  is  the  graph  of  M'  and  G'  o  G  is  the  graph  of  M'M. 

Lemma  3:  A  directed  graph  G  with  self-arcs  at  all  vertices 
is  weakly  connected  if  and  only  if  G'  o  G  is  strongly 
connected. 


Lemma  4:  Let  T  be  a  stochastic  matrix  with  positive 
diagonal  entries.  If  T  has  a  strongly  connected  graph,  then 
the  magnitude  of  its  second  largest  eigenvalue  is  less  than 
one.  If,  on  the  other  hand,  the  magnitude  of  the  second  largest 
eigenvalue  of  T  is  less  than  one,  then  the  graph  of  T  is 
weakly  connected. 

Lemma  5:  The  graph  G  of  a  doubly  stochastic  matrix  D 
is  strongly  connected  if  and  only  if  it  is  weakly  connected.1 

The proof of Lemma 5 which follows is based on ideas
from [15] and [27]. Let G be a directed graph with vertex
set V = {1, 2, ..., n}. Call a vertex j reachable from i if
either j = i or there is a directed path from i to j. Call
a vertex i essential if i is reachable from all vertices which
are reachable from i.

Lemma  6:  Every  directed  graph  has  at  least  one  essential 
vertex. 

To proceed, let us say that vertices i and j are mutually
reachable if each is reachable from the other. Mutual
reachability  is  clearly  an  equivalence  relation  on  V  which 
partitions  V  into  the  disjoint  union  of  a  finite  number  of 
equivalence  classes.  Note  that  if  i  is  an  essential  vertex  of 
G,  then  every  vertex  in  the  equivalence  class  of  i  is  also 
essential.  Thus  every  directed  graph  possesses  at  least  one 
mutually  reachable  equivalence  class  whose  members  are  all 
essential. 
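The notions of reachability, essential vertices, and mutually reachable classes translate directly into code. The sketch below is an illustration under stated assumptions: a graph is a dictionary mapping every vertex to the list of vertices its outgoing arcs point to, and the example graph is hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[int, List[int]]  # vertex -> heads of the arcs leaving it


def reachable_from(G: Graph, i: int) -> Set[int]:
    """Vertices j that are reachable from i (j = i, or a directed path from i to j)."""
    seen = {i}
    stack = [i]
    while stack:
        u = stack.pop()
        for v in G.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def essential_vertices(G: Graph) -> Set[int]:
    """Vertices i that are reachable from every vertex reachable from i."""
    reach = {i: reachable_from(G, i) for i in G}
    return {i for i in G if all(i in reach[j] for j in reach[i])}


def mutually_reachable_classes(G: Graph) -> List[Set[int]]:
    """Equivalence classes of the mutual reachability relation."""
    reach = {i: reachable_from(G, i) for i in G}
    classes: List[Set[int]] = []
    assigned: Set[int] = set()
    for i in sorted(G):
        if i in assigned:
            continue
        cls = {j for j in reach[i] if i in reach[j]}
        classes.append(cls)
        assigned |= cls
    return classes


# Hypothetical graph: 1 -> 2, 2 -> 3, 3 -> 2.
G_example: Graph = {1: [2], 2: [3], 3: [2]}
```

In `G_example`, vertex 1 is not essential because it cannot be reached back from vertex 2, while vertices 2 and 3 form a mutually reachable class whose members are all essential.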

Proof of Lemma 5: Strong connectivity clearly implies
weak connectivity. We prove the converse. Suppose G is
weakly connected. In view of the preceding, G has at
least one mutually reachable equivalence class $\mathcal{E}$ whose
members are all essential. If $\mathcal{E} = V$, then G is obviously
strongly connected. Thus to prove the lemma, it is enough
to show that $\mathcal{E} = V$. Suppose the contrary, namely that
$\mathcal{E} = \{i_1, i_2, \ldots, i_m\}$ is a strictly proper subset of V.
Let $\pi$ be any permutation map for which $\pi(i_j) = j$,
$j \in \{1, 2, \ldots, m\}$, and let P be the corresponding permutation
matrix. Then clearly

$$P'DP = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$$

and P'DP is doubly stochastic. Since P'DP is doubly
stochastic, the column sums of A must all equal one as
must the row sums of the submatrix [A B]. But the
transformation $D \mapsto P'DP$ corresponds to a relabeling
of the vertices of G, so the graph of P'DP must also be
weakly connected. This means that B cannot be the zero
matrix. Therefore the sum of the row sums of A must be
less than m. But this contradicts the fact that the sum of the
column sums of A equals m. Therefore $\mathcal{E} = V$. ■

1 It is clear that strong connectivity of G implies weak connectivity of G.
The converse was conjectured by John Tsitsiklis in a private communication.

Proof of Theorem 5: Let $S$ be a doubly stochastic matrix
with positive diagonal entries. Then 1 is the largest singular
value of $S$ because $S'S$ is doubly stochastic. From this and
Theorem 4 it follows that $|S|_2 \le 1$.

Suppose $S$ is a semi-contraction. Then in view of Theorem
4, the second largest eigenvalue of $S'S$ is less than 1. Thus
by Lemma 4, the graph of $S'S$ is weakly connected. But
$S'S$ is symmetric so its graph must be strongly connected.
Therefore by Lemma 3, the graph of $S$ is weakly connected.
In view of Lemma 5, the graph of $S$ is strongly connected.

Now suppose that the graph of $S$ is strongly connected.
Then the graph of $S$ is weakly connected, so the graph of $S'S$ is
strongly connected because of Lemma 3. Thus by Lemma 4, the
magnitude of the second largest eigenvalue of $S'S$ is less
than 1. From this and Theorem 4 it follows that $S$ is a
semi-contraction. ■

Proof of Theorem 2: Let $\mathcal{M}$ be any compact subset of
$\mathcal{W}$. In view of Theorem 5, each matrix in $\mathcal{M}$ is a
semi-contraction in the two-norm. From this and Proposition 1, it
follows that $\mathcal{M}$ is convergable.

Now suppose that $\mathcal{M}$ is convergable and let $S$ be a matrix
in $\mathcal{M}$. Then $S^i$ converges to a matrix of the form
$\mathbf{1}c$ as $i \to \infty$. This means that the second largest
eigenvalue of $S$ must be less than 1 in magnitude. Thus by Lemma 4,
$S$ must have a weakly connected graph. By Lemma 5, the graph of
$S$ must be strongly connected. ■

The importance of Theorem 5 lies in the fact that the matrices
in every convergable set of doubly stochastic matrices are
semi-contractions in the 2-norm. In view of Proposition 1, this
enables one to immediately compute a rate of convergence for
any infinite product of matrices from any given convergable
set. The coefficient of ergodicity mentioned earlier does not
have this property. If it did, then every doubly stochastic
matrix with a strongly connected graph would have to be
a scrambling matrix. The following counterexample shows
that this is not the case:

$$S = \begin{bmatrix}
.5 & .25 & 0 & 0 & 0 & .25 \\
.25 & .5 & 0 & 0 & 0 & .25 \\
0 & 0 & .5 & .5 & 0 & 0 \\
0 & 0 & .5 & .25 & 0 & .25 \\
0 & 0 & 0 & 0 & .875 & .125 \\
.25 & .25 & 0 & .25 & .125 & .125
\end{bmatrix}$$

In particular, $S$ is a doubly stochastic matrix with a strongly
connected graph but it is not a scrambling matrix.
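The three claims about this counterexample can be verified mechanically. The sketch below is an illustration only; the helper names are hypothetical, rows and columns are indexed from 0, and strong connectivity is tested by checking that every vertex is reachable from vertex 0 in both the graph of the matrix and its reverse.

```python
from typing import List, Set, Tuple

Matrix = List[List[float]]

# The 6 x 6 counterexample matrix from the text.
S = [[0.5, 0.25, 0.0, 0.0, 0.0, 0.25],
     [0.25, 0.5, 0.0, 0.0, 0.0, 0.25],
     [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.25, 0.0, 0.25],
     [0.0, 0.0, 0.0, 0.0, 0.875, 0.125],
     [0.25, 0.25, 0.0, 0.25, 0.125, 0.125]]


def is_doubly_stochastic(M: Matrix, tol: float = 1e-12) -> bool:
    """Nonnegative entries with all row sums and column sums equal to one."""
    n = len(M)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in M)
    cols_ok = all(abs(sum(M[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
    return rows_ok and cols_ok and all(v >= 0 for row in M for v in row)


def non_scrambling_pairs(M: Matrix) -> List[Tuple[int, int]]:
    """Row pairs (i, j), i < j, with no column in which both entries are nonzero."""
    n = len(M)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if not any(M[i][k] != 0 and M[j][k] != 0 for k in range(n))]


def _reach(adj: List[Set[int]], start: int) -> Set[int]:
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_strongly_connected(M: Matrix) -> bool:
    """Every vertex reachable from vertex 0 along arcs and along reversed arcs."""
    n = len(M)
    fwd = [{j for j in range(n) if M[i][j] != 0} for i in range(n)]
    rev = [{j for j in range(n) if M[j][i] != 0} for i in range(n)]
    return len(_reach(fwd, 0)) == n and len(_reach(rev, 0)) == n
```

Rows 0 and 2 (the first and third rows of $S$) have no common nonzero column, which is enough to rule out the scrambling property.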

IV.  Concluding  Remarks 

In  this  paper  we  have  identified  the  largest  “alphabets” 
of  stochastic  and  doubly  stochastic  matrices  with  positive 
diagonal  entries  whose  “words”  converge  exponentially  fast 
as word length increases. In the case of doubly stochastic
matrices,  each  matrix  in  the  corresponding  alphabet  is  shown 
to  be  a  semi-contraction  in  the  two-norm.  In  the  case  of 
stochastic  matrices  which  are  not  doubly  stochastic,  we  were 
not  similarly  successful  and  the  problem  of  discovering  a 
suitable  semi-contraction  for  this  case  remains  unresolved. 
Abstract: By the distributed averaging problem is meant the
problem  of  computing  the  average  value  of  a  set  of  numbers 
possessed  by  the  agents  in  a  distributed  network  using  only 
communication  between  neighboring  agents.  Gossiping  is  a 
well-known  approach  to  the  problem  which  seeks  to  iteratively 
arrive  at  a  solution  by  allowing  each  agent  to  interchange 
information  with  at  most  one  neighbor  at  each  iterative  step. 
Crafting a gossiping protocol which accomplishes this is challenging
because gossiping is an inherently collaborative process
which  can  lead  to  deadlock  unless  careful  precautions  are 
taken  to  ensure  that  it  does  not.  In  this  paper  we  present 
three  gossiping  protocols.  We  show  by  example  that  the  first 
can  deadlock.  While  the  second  cannot,  it  requires  a  degree 
of  network-wide  coordination  which  may  not  be  possible  to 
secure  in  some  applications.  The  third  protocol  uses  only  local 
information,  is  guaranteed  to  avoid  deadlock,  and  requires 
fewer transmissions per iteration than standard broadcast-based
distributed averaging protocols.

I.  Introduction 

There has been considerable interest recently in developing
algorithms for distributing information among the members
of a group of sensors or mobile autonomous agents via
local interactions. Notable among these are those algorithms
intended to cause such a group to reach a consensus in a
distributed manner [1]-[6]. One particular type of consensus
process which has received much attention lately is called
distributed averaging [7]. In its simplest form, distributed
averaging deals with a network of $n > 1$ agents and the
constraint that each agent $i$ is able to communicate only
with certain other agents called agent $i$’s neighbors. Neighbor
relations are described by a simple, connected graph A
in which vertices correspond to agents and edges indicate
neighbor relations. Initially, each agent has or acquires a
real number $y_i$ which might be a measured temperature or
something similar. The distributed averaging problem is to
devise a protocol which will enable each agent to compute
the average $y_{avg} = \frac{1}{n}\sum_{i=1}^{n} y_i$ using only information
acquired from its neighbors. There are many variants of this
problem. For example, instead of real numbers, the $y_i$ may be
integer-valued [8]. Another variant assumes that the edges of
A change over time [9]. This paper considers the case when
the $y_i$ are real and A does not depend on time.

As noted in [7], the distributed averaging problem can be
solved, in principle, by “flooding”; that is, by propagating
across the network over time the values of all of the $y_i$.
Armed with knowledge of all of these values, each agent is
thus able to compute $y_{avg}$. A more sophisticated approach to
the problem is for each agent to use a linear iterative update
rule of the general form

$$x_i(t+1) = w_{ii} x_i(t) + \sum_{j \in \mathcal{N}_i} w_{ij} x_j(t), \quad x_i(0) = y_i$$

where $t$ is a discrete time index, $x_i(t)$ is agent $i$’s current
estimate of $y_{avg}$, the $w_{ij}$ are real-valued weights, and
$\mathcal{N}_i$ is the set of labels of the neighbors of agent $i$. In [7]
several methods are proposed for choosing the $w_{ij}$. One
particular choice, which defines what has come to be known
as the Metropolis algorithm, requires only local information
to define the $w_{ij}$. Algorithms of this type, which require
each agent to communicate with all of its neighbors on each
iteration, are sometimes called broadcast algorithms.
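A minimal sketch of such a broadcast iteration follows. The text does not spell out the Metropolis weights, so the form used here, $w_{ij} = 1/(1 + \max\{d_i, d_j\})$ for neighbors with $w_{ii}$ chosen so that each row sums to one, is an assumption taken from how these weights are commonly stated in the literature; the path graph and initial values are hypothetical.

```python
from typing import Dict, List

Neighbors = Dict[int, List[int]]
Weights = Dict[int, Dict[int, float]]


def metropolis_weights(neighbors: Neighbors) -> Weights:
    """Assumed Metropolis weights: w_ij = 1 / (1 + max(d_i, d_j)) for neighbors j of i."""
    deg = {i: len(nb) for i, nb in neighbors.items()}
    W: Weights = {}
    for i, nb in neighbors.items():
        row = {j: 1.0 / (1 + max(deg[i], deg[j])) for j in nb}
        row[i] = 1.0 - sum(row.values())  # self-weight makes the row sum to one
        W[i] = row
    return W


def iterate(W: Weights, y: Dict[int, float], steps: int) -> Dict[int, float]:
    """Run x_i(t+1) = w_ii x_i(t) + sum_{j in N_i} w_ij x_j(t) from x(0) = y."""
    x = dict(y)
    for _ in range(steps):
        x = {i: sum(w * x[j] for j, w in W[i].items()) for i in W}
    return x


# Hypothetical path graph 0 - 1 - 2 - 3 with initial values averaging to 3.0.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
y = {0: 1.0, 1: 2.0, 2: 3.0, 3: 6.0}
x_final = iterate(metropolis_weights(neighbors), y, 200)
```

Each agent only needs its own degree and the degrees of its neighbors to form its weights, which is the sense in which the choice is local.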

An alternative approach to distributed averaging, which
typically does not involve broadcasting, exploits a form
of “gossiping” [10] specifically tailored to the distributed
averaging problem. The idea of gossiping is very simple. A
pair of neighbors with labels $i$ and $j$ are said to gossip at
time $t$ if both $x_i(t+1)$ and $x_j(t+1)$ are set equal to the
average of $x_i(t)$ and $x_j(t)$. Each agent is allowed to gossip
with at most one neighbor at one time. Under appropriate
assumptions, algorithms which possess this simple property
can be shown to solve the distributed averaging problem.
Gossiping algorithms do not necessarily involve broadcasting
and thus have the potential to require fewer transmissions per
iteration than broadcast algorithms. Of course one would not
expect gossip algorithms to converge as fast as broadcast
algorithms.

The actual sequence of gossip pairs which occurs during
a specific gossip process might be determined either
probabilistically [10] or deterministically [11], [12], depending on
the problem of interest. Deterministic gossiping protocols
are intended to guarantee that under all conditions, a consensus
will be achieved asymptotically, whereas probabilistic
protocols aim at achieving consensus asymptotically with
probability one. Both approaches have merit. The probabilistic
approach is typically somewhat easier both in terms
of algorithm development and convergence analysis. On the
other hand, the deterministic approach forces one to consider
worst case scenarios and has the potential of yielding
algorithms which may outperform those obtained using the
probabilistic approach. The aim of this paper is to present
deterministic gossiping protocols which do not utilize broadcasting
and which generate sequences $x(0), x(1), x(2), \ldots$ which are
guaranteed to converge exponentially fast to the limit vector
which solves the distributed averaging problem.

II.  Gossiping 

The type of gossiping we want to consider involves a
group of $n > 1$ agents labeled 1 to $n$. Each agent $i$ has
control over a real-valued scalar quantity $x_i$ called a gossip
variable which the agent is able to update. A gossip between
agents $i$ and $j$, written $(i, j)$, occurs at time $t$ if the values
of both agents’ variables at time $t + 1$ equal the average
of their values at time $t$. In other words
$x_i(t+1) = x_j(t+1) = \frac{1}{2}(x_i(t) + x_j(t))$. If agent $i$ does
not gossip at time $t$, its gossip variable does not change; thus in
this case $x_i(t+1) = x_i(t)$. Generally not every pair of agents
is allowed to gossip. The edges of A specify which gossip
pairs are allowable. In other words a gossip between agents
$i$ and $j$ is allowable if $(i, j)$ is an edge in A. We sometimes
call A an allowable gossip graph. Although in this paper we
shall be interested primarily in gossiping protocols which
stipulate that each agent is allowed to gossip with at most
one of its neighbors at one time, as we shall see later,
there is value in taking the time here to generalize the idea.
Let us agree to call a subset $\mathcal{L}$ of $m > 1$ agent labels a
neighborhood if each pair of distinct labels in $\mathcal{L}$ are the
labels of vertices in A which are connected. We say that the
agents with labels in $\mathcal{L}$ perform a gossip of order $m$ at time
$t$ if each updates its gossip variable to the average of all;
that is, if $x_i(t+1) = \frac{1}{m}\sum_{j \in \mathcal{L}} x_j(t)$,
$i \in \mathcal{L}$. A generalized gossip is a gossip of any order. A
gossip without the modifier “generalized” will continue to mean a
gossip of order 2.

One rule which sharply distinguishes a gossiping process
from a more general distributed averaging process is that in the
case of gossiping, each agent is allowed to gossip with at most
one of its neighbors at one time. This rule does not preclude
the possibility of two or more pairs of agents gossiping
at the same time, provided no two of the pairs have an
agent in common. More precisely, two gossip pairs $(i, j)$
and $(k, m)$ are noninteracting if neither $i$ nor $j$ equals either
$k$ or $m$. When multiple noninteracting pairs of allowable
gossips occur simultaneously, the simultaneous occurrence
of all such gossips is called a multi-gossip. In other words a
multi-gossip at time $t$ is the set of all gossips which occur at
time $t$, with the understanding that each such pair is allowable
and that any two such pairs are noninteracting. A generalized
multi-gossip at time $t$ is a finite set of generalized gossips
with disjoint neighborhoods which occur simultaneously at
time $t$.

A gossiping process can often be modeled by a discrete
time linear system of the form $x(t+1) = M(t)x(t)$,
$t = 0, 1, 2, \ldots$ where $x \in \mathbb{R}^n$ is a state vector of
gossiping variables and $M(t)$ is a matrix characterizing how $x$
changes as the result of the gossips which take place at time $t$;
sometimes $M(t)$ depends on $x$ although the notational dependence
is often suppressed. If a single pair of distinct agents $i$ and
$j$ gossip at time $t \ge 0$, then $M(t) = P_{ij}$ where $P_{ij}$ is
the $n \times n$ matrix for which
$p_{ii} = p_{ij} = p_{ji} = p_{jj} = \frac{1}{2}$, $p_{kk} = 1$,
$k \notin \{i, j\}$, and all remaining entries equal zero.
We call such $P_{ij}$ single-gossip primitive gossip matrices. If
at time $t$ a multi-gossip occurs, then as a consequence of
non-interaction, $M(t)$ is simply the product of the single-gossip
primitive gossip matrices corresponding to the individual
gossips comprising the multi-gossip; moreover because of
non-interaction, the primitive gossip matrices in the product
commute with each other and so any given permutation of the
primitive matrices in the product determines the same matrix
$P$. We refer to $P$ as the primitive gossip matrix determined
by the multi-gossip under consideration.
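The construction of $P_{ij}$ and of the primitive gossip matrix of a multi-gossip can be sketched as follows. The function names are hypothetical, agents are labeled from 0 rather than 1, and the noninteraction check simply rejects pairs that share an agent.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product A @ B for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def primitive_gossip_matrix(n: int, i: int, j: int) -> Matrix:
    """P_ij: entries (i,i), (i,j), (j,i), (j,j) equal 1/2, other diagonal entries 1."""
    if i == j:
        raise ValueError("a gossip needs two distinct agents")
    P = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for r in (i, j):
        for c in (i, j):
            P[r][c] = 0.5
    return P


def multi_gossip_matrix(n: int, pairs: Sequence[Tuple[int, int]]) -> Matrix:
    """Product of the P_ij of a multi-gossip; pairs must be noninteracting."""
    used = set()
    P = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i, j in pairs:
        if i in used or j in used:
            raise ValueError("gossip pairs must not share an agent")
        used.update((i, j))
        P = matmul(primitive_gossip_matrix(n, i, j), P)
    return P


def apply(P: Matrix, x: List[float]) -> List[float]:
    """One gossip step x(t+1) = P x(t)."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


P01 = primitive_gossip_matrix(4, 0, 1)
P23 = primitive_gossip_matrix(4, 2, 3)
P = multi_gossip_matrix(4, [(0, 1), (2, 3)])
```

Because the pairs (0, 1) and (2, 3) are noninteracting, the two factors commute and either order of multiplication gives the same `P`.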

The idea of a primitive gossip matrix extends naturally to
generalized gossips. In particular, we associate with a
neighborhood $\mathcal{L}$ the $n \times n$ matrix $P_{\mathcal{L}}$ where
$p_{jk} = \frac{1}{m}$, $j, k \in \mathcal{L}$, with $m$ the number of
labels in $\mathcal{L}$, $p_{jj} = 1$, $j \notin \mathcal{L}$, and 0s
elsewhere. We call $P_{\mathcal{L}}$ the primitive
gossip matrix determined by $\mathcal{L}$. By the graph induced by
$P_{\mathcal{L}}$, written $G_{\mathcal{L}}$, we mean the spanning subgraph
of A whose edge set is all edges in A which are incident on vertices
with labels which are both in $\mathcal{L}$. More generally, if
$\mathcal{L}_1, \mathcal{L}_2, \ldots, \mathcal{L}_k$ are $k$ disjoint
neighborhoods, the matrix
$P_{\mathcal{L}_1} P_{\mathcal{L}_2} \cdots P_{\mathcal{L}_k}$ is
the primitive gossip matrix determined by
$\mathcal{L}_1, \mathcal{L}_2, \ldots, \mathcal{L}_k$
and the graph induced by
$P_{\mathcal{L}_1} P_{\mathcal{L}_2} \cdots P_{\mathcal{L}_k}$ is the
union of the induced graphs $G_{\mathcal{L}_i}$,
$i \in \{1, 2, \ldots, k\}$. Note that the matrices in the product
$P_{\mathcal{L}_1} P_{\mathcal{L}_2} \cdots P_{\mathcal{L}_k}$ commute
because the $\mathcal{L}_i$ are disjoint, so the order of the matrices
in the product is not important for the definition to make sense.
Note also that there are only finitely many primitive gossip matrices
associated with A.

A.  Gossiping  Sequences 

Let $\gamma_1, \gamma_2, \ldots$ be an infinite sequence of multi-gossips
corresponding to some or all of the edges in A. Corresponding
to such a sequence is a sequence of primitive gossip matrices
$Q_1, Q_2, \ldots$ where $Q_i$ is the primitive gossip matrix of the
$i$th multi-gossip in the sequence. For given $x(0)$, such a
gossiping matrix sequence generates the sequence of vectors

$$x(t) = Q_t Q_{t-1} \cdots Q_1 x(0), \quad t > 0 \qquad (1)$$

which we call a gossiping sequence. We have purposely
restricted this definition of a gossiping sequence to multi-gossip
sequences, as opposed to generalized multi-gossip
sequences, since we will only be dealing with algorithms
involving multi-gossips. Our reason for considering generalized
multi-gossips will become clear in a moment.

As will soon be obvious, the matrices $Q_i$ in (1) are not
necessarily the only primitive gossip matrices for which
(1) holds. This non-uniqueness can play a crucial role in
understanding certain gossip protocols which are not linear
iterations. To understand why this is so, let us agree to say
that the transition $x(\tau) \mapsto x(\tau+1)$ contains a virtual gossip
if there is a neighborhood $\mathcal{L}$ for which $x_i(\tau) = x_j(\tau)$,
$i, j \in \mathcal{L}$. We say that agent i has gossiped virtually with agent j at
time $\tau$ if i and j are both labels in $\mathcal{L}$. Thus while we are only
interested in algorithms in which an agent may gossip with
at most one neighbor at any one time, for such algorithms
there may be times at which virtual gossips occur between
an agent and one or more of its neighbors. Suppose that for
some time $\tau < t$, the transition $x(\tau) \mapsto x(\tau+1)$ contains
such a virtual gossip and let $P_{\mathcal{L}}$ denote the primitive gossip
matrix determined by $\mathcal{L}$. Then clearly $P_{\mathcal{L}} x(\tau) = x(\tau)$, which
means that the matrix $Q_{\tau+1}$ in the product $Q_t Q_{t-1} \cdots Q_1$
can be replaced by the matrix $Q_{\tau+1} P_{\mathcal{L}}$ without changing the
validity of (1). Moreover $Q_{\tau+1} P_{\mathcal{L}}$ will be a primitive gossip
matrix if the neighborhoods which define $Q_{\tau+1}$ are disjoint
with $\mathcal{L}$. The importance of this elementary observation is
simply this. Without taking into account virtual gossips in
equations such as (1), it may in some cases be impossible
to conclude that the matrix product $Q_t Q_{t-1} \cdots Q_1$ converges
as $t \to \infty$ even though the gossip sequence $x(1), x(2), \ldots$
does. Later in this paper we will describe a gossip protocol
for which this is true.
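
The fixed-point property behind a virtual gossip can be checked numerically. The following minimal sketch assumes, purely for illustration, that the neighborhood is a single pair of agents, so that the associated primitive gossip matrix averages just two entries; the helper names `pair_gossip_matrix` and `apply` are hypothetical and do not come from the paper.

```python
from fractions import Fraction

def pair_gossip_matrix(n, i, j):
    """n x n matrix that replaces entries i and j of a vector by their average
    (an assumed single-pair instance of a primitive gossip matrix)."""
    P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    half = Fraction(1, 2)
    P[i][i] = P[j][j] = P[i][j] = P[j][i] = half
    return P

def apply(P, x):
    """Matrix-vector product P x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

# Agents 0 and 1 already hold the same value, agent 2 does not.
x = [Fraction(3), Fraction(3), Fraction(7)]
virtual = apply(pair_gossip_matrix(3, 0, 1), x)  # virtual gossip: x is unchanged
actual = apply(pair_gossip_matrix(3, 1, 2), x)   # real gossip: entries 1 and 2 change
```

Because the first product returns x unchanged, inserting that matrix into the product on the right of (1) leaves the generated gossip sequence untouched, which is exactly the freedom exploited in the text.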

Prompted by the preceding, let us agree to say that
a gossiping sequence satisfying (1) is consistent with a
sequence of primitive gossip matrices $P_1, P_2, \ldots$ if

$$x(t) = P_t P_{t-1} \cdots P_1 x(0), \quad t > 0.$$

It is obvious that if the sequence $x(t)$, $t > 0$, is consistent
with the sequence $P_1, P_2, \ldots$ and the latter converges, then
so does the former. Given a gossip vector sequence, our
task then is to find, if possible, a consistent, primitive gossip
matrix sequence which is also convergent.

As we have already noted, A has associated with it a finite
family of primitive gossip matrices and each primitive gossip
matrix induces a spanning subgraph of A. It follows that
any finite sequence of primitive gossip matrices $P_1, P_2, \ldots, P_k$
induces a spanning subgraph of A whose edge set is the
union of the edge sets of the graphs induced by all of
the $P_i$. We say that the primitive gossip matrix sequence
$P_1, P_2, \ldots, P_k$ is complete if the graph the sequence induces
is a connected spanning subgraph of A. An infinite sequence
of primitive gossip matrices $P_1, P_2, \ldots$ is repetitively
complete with period T, if each successive subsequence of
length T in the sequence is complete. A gossiping sequence
$x(t)$, $t > 0$, is repetitively complete with period T, if there
is a consistent sequence of primitive gossip matrices which
is repetitively complete with period T. The importance of
repetitive completeness is as follows.

Theorem 1: Suppose $P_1, P_2, \ldots$ is an infinite sequence of
primitive gossip matrices which is repetitively complete with
period T. There exists a real nonnegative number $\lambda < 1$,
depending only on T and the $P_t$, for which

$$\lim_{t \to \infty} P_t P_{t-1} \cdots P_1 x(0) = y_{\mathrm{avg}} \mathbf{1}$$

as fast as $\lambda^t$ converges to zero.

There  are  several  different  ways  to  prove  this  theorem 
using  ideas  from  [2],  [5],  [6],  [13],  [14].  A  proof  of  this 
theorem  will  be  given  in  the  full  length  version  of  this  paper. 
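
As an informal illustration of Theorem 1 (not a proof), the sketch below applies a repetitively complete gossip sequence with period T = 3 on a four-agent path graph and records the largest deviation from the average. The graph, the initial values and the helper name `gossip` are assumptions chosen only for this example.

```python
def gossip(x, i, j):
    """Return a copy of x in which agents i and j both hold their average."""
    y = list(x)
    y[i] = y[j] = (x[i] + x[j]) / 2.0
    return y

# Path graph 0 - 1 - 2 - 3; cycling through its three edges gives a
# sequence in which every window of three successive gossips is complete.
edges = [(0, 1), (1, 2), (2, 3)]
x = [4.0, 0.0, 8.0, 0.0]
y_avg = sum(x) / len(x)

errors = []
for t in range(300):
    i, j = edges[t % len(edges)]
    x = gossip(x, i, j)
    errors.append(max(abs(v - y_avg) for v in x))
```

The recorded errors shrink toward zero, in line with the exponential convergence asserted by the theorem; the rate observed here is specific to this small example.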


III.  Request-Based  Gossiping 

Request-based  gossiping  is  a  gossiping  process  in  which  a 
gossip  occurs  between  two  agents  whenever  one  of  the  two 
accepts  a  request  to  gossip  placed  by  the  other.  The  aim  of 
this  section  is  to  discuss  this  process. 

In  a  request-based  gossiping  process,  a  given  agent  i  may 
gossip  with  one  of  its  neighbors  at  time  t  only  if  t  is  either 
an  “event  time”  of  agent  i  or  an  “event  time”  of  the  neighbor 
which  has  made  a  request  to  gossip  with  agent  i.  By  an  event 
time  of  agent  i  is  meant  a  time  at  which  agent  i  may  place  a 
request  to  gossip  with  one  of  its  neighbors.  By  an  event  time 
interval  of  agent  i  is  meant  the  interval  of  time  between  two 
successive  event  times  of  agent  i.  For  obvious  reasons,  we 
assume that the lengths of agent i’s event time intervals are
all bounded above by a finite positive number $T_i$. We write
$\mathcal{T}_i$ for the set of event times of agent i and $\mathcal{T}$ for the union
of the event time sequences of all n agents.

Conflicts  leading  to  deadlocks  can  arise  if  an  agent  who 
has  placed  a  request  to  gossip,  at  the  same  time  receives 
a  request  to  gossip  from  another  agent.  It  is  challenging 
to  devise  rules  which  resolve  such  conflicts  while  at  the 
same  time  ensuring  exponential  convergence  of  the  gossiping 
process.  One  way  to  avoid  such  conflicts  is  to  assign  event 
times  off  line  so  that  no  agent  can  receive  a  request  to  gossip 
at  any  of  its  own  event  times.  There  are  several  ways  to  do 
this  which  will  be  discussed  below. 

From time to time, agent i may have more than one
neighbor to which it might be able to make a request to
gossip. Also from time to time, agent i may receive
more than one request to gossip. While in such situations
decisions about whom to place requests with or whose request
to accept can be randomized, in this paper we will examine
only completely deterministic strategies. To do this we will
assume that each agent i has ordered its neighbors in $\mathcal{N}_i$
according to some priorities so when a choice occurs between
neighbors, agent i will always choose the one with highest
priority.

Consider  first  the  situation  when  the  event  times  of  each 
agent  and  each  agent’s  neighbor  priorities  are  chosen  off  line 
and  are  fixed  throughout  the  gossiping  process.  Assume  that 
the  event  times  are  chosen  so  that  no  agent  can  receive  a 
request  to  gossip  at  any  of  its  own  event  times.  Our  aim 
is  to  show  that  this  arrangement  can  be  problematic.  The 
following  protocol  illustrates  this. 

Protocol I: At each event time $t \in \mathcal{T}$ the following rules
apply for each $i \in \{1, 2, \ldots, n\}$:

1) If $t \in \mathcal{T}_i$, agent i places a request to gossip with that
neighbor whose priority is the highest.

2) If $t \notin \mathcal{T}_i$, agent i does not place a request to gossip.

3) Each agent i receiving one or more requests to gossip
must gossip with that requesting neighbor whose
priority is the highest.

4) If $t \notin \mathcal{T}_i$ and agent i does not receive a request to
gossip, it does not gossip.

The  following  example  shows  that  this  simple  strategy  will 
not necessarily lead to a consensus. Suppose that A is a path
graph with edges (a,b),(b,c),(c,d). Assume that agents a
and  b  have  distinct  event  times  and  that  agents  a  and  c  have 
the  same  event  times  as  do  agents  b  and  d:  note  that  this 
guarantees  that  no  agent  can  receive  a  request  to  gossip  at 
any  of  its  own  event  times.  To  avoid  ambiguities  in  decision 
making,  suppose  that  agent  b  assigns  a  higher  priority  to  a 
than  to  c  and  agent  c  assigns  a  higher  priority  to  d  than 
to  b.  Let  t  be  an  event  time  of  agents  a  and  c.  Then  at 
this  time  a  places  a  request  to  gossip  with  b  and  c  places 
a  request  to  gossip  with  d.  Since  b  and  d  receive  no  other 
requests,  gossips  take  place  between  a  and  b  and  between  c 
and  d.  Alternatively,  if  t  is  an  event  time  of  agents  b  and  d, 
then  at  this  time,  b  places  a  request  to  gossip  with  a  and  d 
places  a  request  to  gossip  with  c.  Since  a  and  c  receive  no 
other  requests,  gossips  again  take  place  between  a  and  b  and 
between  c  and  d.  Thus  under  no  conditions  is  there  ever  a 
gossip  between  b  and  c,  so  the  gossiping  process  will  never 
reach  a  consensus.  The  reader  may  wish  to  verify  that  simply 
changing  the  priorities  will  not  rectify  this  situation:  For  any 
choice  of  priorities,  there  will  always  be  at  least  one  gossip 
needed  to  reach  a  consensus,  which  will  not  take  place. 
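
The path-graph example can be replayed in a few lines. The sketch below is only an illustration under the stated priorities and event times; the initial gossip values and the function name `protocol_one_step` are assumptions introduced here.

```python
# Neighbor lists are written in priority order, highest priority first.
priority = {"a": ["b"], "b": ["a", "c"], "c": ["d", "b"], "d": ["c"]}
# Agents a and c share event times, as do b and d; the two groups alternate.
schedule = [("a", "c"), ("b", "d")]

def protocol_one_step(x, requesters):
    """Apply rules 1)-4) of Protocol I at one event time; return the gossiping pairs."""
    requests = {}
    for i in requesters:                      # rule 1: ask the top-priority neighbor
        requests.setdefault(priority[i][0], []).append(i)
    pairs = []
    for j, askers in requests.items():        # rule 3: accept the best-ranked requester
        chosen = min(askers, key=priority[j].index)
        pairs.append(frozenset((j, chosen)))
        x[j] = x[chosen] = (x[j] + x[chosen]) / 2.0
    return pairs

x = {"a": 0.0, "b": 2.0, "c": 4.0, "d": 10.0}   # assumed initial values, average 4
seen = set()
for t in range(50):
    seen.update(protocol_one_step(x, schedule[t % 2]))
```

In this run only the pairs {a, b} and {c, d} ever gossip, so the values settle at two different levels and no consensus on the average is reached.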

The  preceding  example  illustrates  that  fixed  priorities  can 
present  problems.  The  global  ordering  proposed  in  [11]  is 
one  way  to  overcome  them.  In  what  follows  we  take  an 
alternative  approach. 

In  the  light  of  Theorem  1  it  is  of  interest  to  consider 
gossiping  protocols  which  generate  repetitively  complete 
gossip  sequences.  Towards  this  end,  let  us  agree  to  say  that 
an  agent  i  has  completed  a  round  of  gossiping  after  it  has 
gossiped with each neighbor in $\mathcal{N}_i$ at least once. Thus the
finite  sequence  of  primitive  gossiping  matrices  corresponding 
to  a  finite  sequence  of  multi-gossips  for  the  entire  group 
which  has  occurred  over  an  interval  of  length  T,  will  be 
complete  if  over  the  same  period  each  agent  in  the  group 
completes  a  round. 

For the protocols which follow it will be necessary for
each agent i to keep track of where it is in a particular
round. To do this, agent i makes use of a recursively updated
neighbor queue $q_i(t)$ where $q_i(\cdot)$ is a function from $\mathcal{T}$
to the set of all possible lists of the $n_i$ labels in $\mathcal{N}_i$, the
neighbor set of agent i. Roughly speaking, $q_i(t)$ is a list of
the labels of the neighbors of agent i which defines the queue
of neighbors at time t which are in line to gossip with agent
i. The updating of $q_i(t)$ is straightforward: If neighbor j
gossips with agent i at time t, the updated queue $q_i(t+1)$ is
obtained by moving agent j’s label from its current position
in $q_i(t)$ to the end of the queue. If on the other hand, agent
i does not gossip at time t, $q_i(t+1) = q_i(t)$.
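
The queue update just described amounts to a move-to-back operation. A minimal sketch follows, with the queue represented as a Python list whose first element is the front; the function name `update_queue` is hypothetical.

```python
def update_queue(queue, partner=None):
    """Return q_i(t+1) given q_i(t) and the neighbor gossiped with at time t.

    If `partner` is None, agent i did not gossip and the queue is unchanged.
    """
    if partner is None:
        return list(queue)
    # Move the partner's label from its current position to the end.
    return [k for k in queue if k != partner] + [partner]

q = ["b", "c", "d"]
q_after_gossip = update_queue(q, "c")
q_no_gossip = update_queue(q)
```

After gossiping with c, the remaining neighbors b and d keep their relative order and now stand ahead of c.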

A.  Protocols 

As noted earlier, it is helpful to have event time assignments
which guarantee that no agent can receive a request to
gossip at any of its own event times. One way to accomplish
this is to use event time assignments which satisfy the
following assumption.

Distinct neighbor event times assumption: For each $i \in
\{1, 2, \ldots, n\}$ and each $j \in \mathcal{N}_i$, $\mathcal{T}_i$ and $\mathcal{T}_j$ are disjoint sets.

Thus if this assumption holds, the event times of each
agent are distinct from the event times of all of its neighbors.
In all cases the largest number of distinct event time
sequences which would need to be assigned to A to satisfy
the distinct neighbor event times assumption is no greater
than one plus the maximum vertex degree of A [15].
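
One way to see why one plus the maximum vertex degree always suffices is to treat each event time sequence as a color and color the graph greedily. The sketch below is an assumed illustration of that standard argument, not the construction used in [15]; the name `assign_event_classes` and the five-cycle example are hypothetical.

```python
def assign_event_classes(adjacency):
    """Greedy coloring: give each agent the smallest class index not used by
    an already-processed neighbor. Agents in different classes can then be
    given disjoint event time sequences."""
    classes = {}
    for v in adjacency:
        used = {classes[u] for u in adjacency[v] if u in classes}
        c = 0
        while c in used:
            c += 1
        classes[v] = c
    return classes

# Five agents arranged in a cycle (maximum vertex degree 2).
adjacency = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
classes = assign_event_classes(adjacency)
num_classes = max(classes.values()) + 1
max_degree = max(len(nb) for nb in adjacency.values())
```

Each agent has at most `max_degree` neighbors, so the greedy choice never needs an index larger than `max_degree`, which gives at most `max_degree + 1` classes. Agents in class c could then, for instance, be given the event times congruent to c modulo `num_classes`.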

Under the distinct neighbor event times assumption, it is
possible to ensure exponential convergence with the following
protocol.

Protocol II: Suppose that the distinct neighbor event times
assumption holds. At each event time $t \in \mathcal{T}$ the following
rules apply for each $i \in \{1, 2, \ldots, n\}$:

1) If $t \in \mathcal{T}_i$, agent i places a request to gossip with that
neighbor whose label is at the front of the queue $q_i(t)$.

2) If $t \notin \mathcal{T}_i$, agent i does not place a request to gossip.

3) Each agent i receiving one or more requests to gossip
must gossip with that requesting neighbor whose label
is closest to the front of the queue $q_i(t)$.

4) If $t \notin \mathcal{T}_i$ and agent i does not receive a request to
gossip, it does not gossip.

Proposition 1: Let $\mathcal{E}$ denote the set of all edges (i, j) in
A. Suppose that the distinct neighbor event times assumption
holds and that all agents in the group adhere to Protocol
II. Then the infinite gossiping sequence generated will be
repetitively complete with period


$$T = \max_{(i,k) \in \mathcal{E}} \min \Big\{ T_i \sum_{j \in \mathcal{N}_i} n_j, \; T_k \sum_{j \in \mathcal{N}_k} n_j \Big\}$$


The  proof  of  Proposition  1  can  be  found  in  [15]. 
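
For comparison with the earlier path-graph example, the following sketch runs Protocol II on the same graph with the same event times but with neighbor queues in place of fixed priorities. It is a simplified illustration; the initial values and the name `protocol_two_step` are assumptions.

```python
def protocol_two_step(x, queues, requesters):
    """One event time of Protocol II; `requesters` are the agents whose event
    time it is. Returns the list of (acceptor, requester) pairs that gossip."""
    requests = {}
    for i in requesters:                      # rule 1: ask the front of the queue
        requests.setdefault(queues[i][0], []).append(i)
    # rule 3: each requested agent picks the requester nearest its queue front
    pairs = [(j, min(askers, key=queues[j].index)) for j, askers in requests.items()]
    for j, i in pairs:
        x[i] = x[j] = (x[i] + x[j]) / 2.0
        for a, b in ((i, j), (j, i)):         # move the partner to the back
            queues[a].remove(b)
            queues[a].append(b)
    return pairs

x = {"a": 0.0, "b": 2.0, "c": 4.0, "d": 10.0}   # assumed initial values, average 4
queues = {"a": ["b"], "b": ["a", "c"], "c": ["d", "b"], "d": ["c"]}
schedule = [("a", "c"), ("b", "d")]             # distinct neighbor event times
seen = set()
for t in range(200):
    for j, i in protocol_two_step(x, queues, schedule[t % 2]):
        seen.add(frozenset((i, j)))
```

Because a gossip moves the partner to the back of each queue, agents b and c eventually reach the front of each other's queues, the edge (b, c) is used, and the values approach the average.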

A  disadvantage  of  Protocol  II  is  that  it  requires  the  distinct 
neighbor  event  times  assumption.  This  assumption  can  only 
be  satisfied  by  off-line  assignment  of  event  times  for  each 
agent,  and  in  some  applications  such  an  off-line  assignment 
may  be  undesirable.  In  a  recent  doctoral  thesis  [16],  a 
clever  gossiping  protocol  is  proposed  which  does  not  require 
the  distinct  neighbor  event  times  assumption.  The  protocol 
avoids  deadlocks  and  achieves  consensus  exponentially  fast. 
A  disadvantage  of  the  protocol  in  [16]  is  that  it  requires  each 
agent  to  obtain  the  values  of  all  of  its  neighbors’  gossip 
variables  at  each  clock  time.  By  exploiting  one  of  the  key 
ideas  in  [16]  together  with  the  notion  of  an  agent’s  neighbor 
queue $q_i(t)$ defined earlier, it is possible to obtain a gossiping
protocol  which  also  avoids  deadlocks  and  achieves  consensus 
exponentially  fast  but  without  requiring  each  agent  to  obtain 
the  values  of  all  of  its  neighbors’  gossip  variables  at  each 
iteration. 

In the sequel we will outline a gossiping algorithm in
which at time t, each agent i has a single preferred neighbor
whose label $i^*(t)$ is in the front of queue $q_i(t)$. At time
t each agent i transmits to its preferred neighbor its label
i and the current value of its gossip variable $x_i(t)$. Agent
i then transmits the current value of its gossip variable to
those agents which have agent i as their preferred neighbor;
these neighbors plus neighbor $i^*(t)$ are agent i’s receivers
at time t. They are the neighbors of agent i who know
the current gossip value of agent i. Agent i is presumed to
have placed a request to gossip with its preferred neighbor
$i^*(t)$ if $x_i(t) > x_{i^*(t)}(t)$; agent i is a requester of agent
$i^*(t)$ whenever this is so. Note that while an agent has
exactly one preferred neighbor, it may at the same time have
anywhere from zero to $n_i$ requesters, where $n_i$ is the number
of neighbors of agent i.

Protocol III: Between clock times t and t + 1 each
agent i performs the steps enumerated below in the order
indicated. Although the agents’ actions need not be precisely
synchronized, it is understood that for each $k \in \{1, 2, 3\}$ all
agents complete step k before any embark on step k + 1.

1) 1st Transmission: Agent i sends its label i and its
gossip value $x_i(t)$ to its current preferred neighbor.
At the same time agent i receives the labels and
corresponding gossip values from all of those neighbors
which have agent i as their current preferred neighbor.

2) 2nd Transmission: Agent i sends its current gossip
value $x_i(t)$ to those neighbors which have agent i as
their current preferred neighbor.

3) Acceptances:

a) If agent i has not placed a request to gossip but
has received at least one request to gossip, then
agent i sends an acceptance to that particular
requesting neighbor whose label is closest to the
front of the queue $q_i(t)$.

b) If agent i either has placed a request to gossip or
has not received any request to gossip, then agent
i does not send out an acceptance.

4) Gossip variable and queue updates:

a) If agent i either sends an acceptance to or receives
an acceptance from neighbor j, then agent i
gossips with neighbor j by setting

$$x_i(t+1) = \frac{x_i(t) + x_j(t)}{2}.$$

Agent i updates its queue by moving j and the
labels of all of its current receivers k, if any, for
which $x_k(t) = x_i(t)$ from their current positions
in $q_i(t)$ to the end of the queue while maintaining
their relative order.

b) If agent i has not sent out an acceptance nor
received one, then agent i does not update the
value of $x_i(t)$. In addition, $q_i(t)$ is not updated
except when agent i’s gossip value equals that
of at least one of its current receivers. In this
special case agent i moves the labels of all of its
current receivers k for which $x_k(t) = x_i(t)$ from
their current positions in $q_i(t)$ to the end of the
queue, while maintaining their relative order.

In summary,

• For agent i to place a request to gossip, the current
value of its gossip variable must be larger than that of
its current preferred neighbor.

• For a gossip to occur between two agents i and j at time
t, one - say i - must be the current preferred neighbor
of the other (i.e., $i = j^*(t)$), $x_j(t)$ must be larger than
$x_i(t)$, and j must be the label of the neighbor of agent
i with highest priority which is placing a request to
gossip with agent i.

• For agent i to update its queue, it must either gossip
with a neighbor j or, if not, its current gossip value
must equal that of at least one of its receivers.
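
The steps of Protocol III can be simulated centrally to watch the protocol at work. The sketch below is a simplified, synchronous rendering of the four steps in which the message exchanges are replaced by direct look-ups; exact rational arithmetic is used so that the equality tests defining virtual gossips are reliable. The graph, the initial values and the name `protocol_three_step` are assumptions made for illustration.

```python
from fractions import Fraction

def protocol_three_step(x, queues):
    """One iteration of Protocol III. x maps labels to gossip values and
    queues maps labels to neighbor queues (front of the queue first)."""
    preferred = {i: q[0] for i, q in queues.items()}
    # Steps 1-2: i is a requester of its preferred neighbor if x_i > x_{i*}.
    requesters_of = {i: [] for i in queues}
    placed = set()
    for i, p in preferred.items():
        if x[i] > x[p]:
            requesters_of[p].append(i)
            placed.add(i)
    # Step 3: agents that placed no request accept their best-ranked requester.
    partner = {}
    for i in queues:
        if i not in placed and requesters_of[i]:
            j = min(requesters_of[i], key=queues[i].index)
            partner[i], partner[j] = j, i
    # Step 4: gossip variable and queue updates, all based on values at time t.
    new_x, new_queues = dict(x), {}
    for i, q in queues.items():
        receivers = {preferred[i]} | {k for k in q if preferred[k] == i}
        to_back = {k for k in receivers if x[k] == x[i]}   # virtual gossips
        if i in partner:
            new_x[i] = (x[i] + x[partner[i]]) / 2
            to_back.add(partner[i])
        new_queues[i] = ([k for k in q if k not in to_back]
                         + [k for k in q if k in to_back])
    return new_x, new_queues

# Path graph a - b - c - d with assumed initial values whose average is 4.
x = {k: Fraction(v) for k, v in {"a": 0, "b": 2, "c": 4, "d": 10}.items()}
queues = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
for t in range(400):
    x, queues = protocol_three_step(x, queues)
spread = max(x.values()) - min(x.values())
```

No event times are assigned in this run, yet the sum of the gossip values is preserved at every step and the spread between the largest and smallest value shrinks toward zero.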

Transmissions required: During step 1, each agent sends a
transmission to its preferred neighbor so the total number of
transmissions required for all n agents to complete step 1 is
n. During step 2, each neighbor of agent i which has agent i
as its current preferred neighbor receives a transmission from
agent i; since every agent has exactly one preferred neighbor,
the total number of transmissions required for all n agents to
complete step 2 is also n. The total number of transmissions
of all agents required to complete step 3a is clearly no
greater than $\frac{n}{2}$. Thus the total number of transmissions per
iteration to carry out the protocol just described is no greater
than $\frac{5}{2}n$. With a broadcasting protocol such as the one
considered in [16] the total number of transmissions per
iteration is $n d_{\mathrm{avg}}$ where $d_{\mathrm{avg}}$ is the average vertex degree
of the underlying graph A. Thus for allowable gossip graphs
with average vertex degree exceeding $\frac{5}{2}$, fewer transmissions
are required per iteration to do averaging with the protocol
under consideration than are required per iteration to do
averaging via broadcasting.
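
The comparison is easy to evaluate for concrete graphs. The sketch below uses only the two counts stated in the text, the bound of 5n/2 and the broadcast count of n times the average degree; the function names and the two example graphs are assumptions.

```python
def average_degree(n, edges):
    """Average vertex degree of a simple graph with n vertices."""
    return 2.0 * len(edges) / n

def request_based_bound(n):
    """Upper bound on transmissions per iteration: n + n + n/2."""
    return 2.5 * n

def broadcast_count(n, edges):
    """Transmissions per iteration when every agent contacts all neighbors."""
    return n * average_degree(n, edges)

n = 6
cycle = [(k, (k + 1) % n) for k in range(n)]                     # average degree 2
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]   # average degree 5

cycle_counts = (request_based_bound(n), broadcast_count(n, cycle))
complete_counts = (request_based_bound(n), broadcast_count(n, complete))
```

For the cycle the broadcast count is below the bound, while for the complete graph, whose average degree exceeds 5/2, the request-based protocol needs fewer transmissions per iteration.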

Theorem  2:  Every  sequence  of  gossip  vectors  x(t),  t  >  0 
generated  by  protocol  III  is  repetitively  complete  with  period 
no  greater  than  the  number  of  edges  of  A. 

Theorems  1  and  2  thus  imply  that  every  sequence  of  gossip 
vectors  generated  by  protocol  III  converges  to  the  desired 
limit point exponentially fast at a rate no worse than some
finite number $\lambda < 1$ which depends only on A. Calculation of
this  worst  case  bound  is  a  subject  for  future  research. 

To  prove  Theorem  2,  we  need  a  few  ideas.  First  note  that 
step  4  of  the  protocol  stipulates  that  agent  i  must  update 
its  queue  whenever  its  current  gossip  value  equals  that  of 
one  of  its  neighbors.  We  say  that  agent  i  gossips  virtually 
with  neighbor  j  at  time  t  if  the  current  gossip  values  of  both 
agents  are  the  same.  Note  that  while  an  agent  can  gossip 
with  at  most  one  agent  at  time  t,  it  can  gossip  virtually 
with as many as $n_i$ at the same time. To proceed, we need
to generalize slightly the idea of a round. We say that an
agent i has completed a round of gossiping after it has
gossiped or virtually gossiped with each neighbor in $\mathcal{N}_i$ at
least  once.  Thus  the  finite  sequence  of  primitive  gossiping 
matrices  corresponding  to  a  finite  sequence  of  multi-gossips 
and  virtual  multi-gossips  for  the  entire  group  which  has 
occurred  over  an  interval  of  length  T,  will  be  complete  if 
over  the  same  period  each  agent  in  the  group  completes  a 
round.  Thus  Theorem  2  will  be  true  if  every  agent  completes 
a  round  in  a  number  of  iterations  no  larger  than  the  number 
of  edges  of  A.  The  following  proposition  asserts  that  this  is 
in  fact  the  case. 

Proposition  2:  Let  m  be  the  number  of  edges  in  A. 
Then  within  m  iterations  every  agent  will  have  gossiped  or 
virtually  gossiped  at  least  once  with  each  of  its  neighbors. 

To prove this proposition we will make use of the following
two lemmas.

Lemma  1:  Suppose  that  all  n  agents  follow  protocol  III. 
Then  at  each  time  t,  at  least  one  gossip  or  virtual  gossip 
must  occur. 

Lemma 2: Let t be fixed and suppose that G is a spanning
subgraph of A with at least one edge. For each $i \in
\{1, 2, \ldots, n\}$ write $\mathcal{N}_i$ for the set of labels of the vertices
adjacent to vertex i in A and $\mathcal{N}_i^{G}$ for the set of labels
of the vertices adjacent to vertex i in G. Let $\mathcal{N}_i - \mathcal{N}_i^{G}$
denote the complement of $\mathcal{N}_i^{G}$ in $\mathcal{N}_i$. Suppose that for each
$i \in \{1, 2, \ldots, n\}$, each label in $\mathcal{N}_i^{G}$, if any, is closer to the
front of $q_i(t)$ than all the labels in $\mathcal{N}_i - \mathcal{N}_i^{G}$. Then there must
be an edge (i, j) within G such that at time t, neighboring
agents i and j either gossip or gossip virtually.

The  proofs  of  Lemma  1  and  Lemma  2  are  omitted  due  to 
space  limitations;  they  will  be  given  in  a  full  length  version 
of  this  paper. 

Proof  of  Proposition  2:  If  m  =  1,  there  can  be  only  two 
agents  so  n  =  2.  In  view  of  Lemma  1,  Proposition  2  must 
clearly  be  true  for  this  case. 

Suppose m > 1. Fix t and let $\mathcal{E}_0$ be the edge set of A. For
$k \in \{1, 2, \ldots, m\}$ let $\mathcal{E}_k$ denote the set of all edges (i, j)
in $\mathcal{E}_0$ for which agents i and j have gossiped or virtually
gossiped at least once within k iterations starting at time t.
Fix k. If $\mathcal{E}_k = \mathcal{E}_0$, then each agent will have gossiped or
virtually gossiped at least once with each of its neighbors
within k iterations. Since $k \le m$, each agent will have
gossiped or virtually gossiped at least once with each of its
neighbors within m iterations starting at time t.

Now suppose that $\mathcal{E}_k \subsetneq \mathcal{E}_0$, in which case the complement
of $\mathcal{E}_k$ in $\mathcal{E}_0$, namely $\mathcal{E}_0 - \mathcal{E}_k$, is nonempty. Let $G_k$ denote
the spanning subgraph of A with edge set $\mathcal{E}_0 - \mathcal{E}_k$. For each
$i \in \{1, 2, \ldots, n\}$ write $\mathcal{N}_i$ for the set of labels of the vertices
adjacent to vertex i in A and $\mathcal{N}_i^{G_k}$ for the set of labels of the
vertices adjacent to vertex i in $G_k$. Let $\mathcal{N}_i - \mathcal{N}_i^{G_k}$ denote the
complement of $\mathcal{N}_i^{G_k}$ in $\mathcal{N}_i$. For each label $j \in \mathcal{N}_i - \mathcal{N}_i^{G_k}$,
if any, $(i, j) \in \mathcal{E}_k$ which means that agents i and j have
gossiped or virtually gossiped at least once within k iterations
starting at time t. On the other hand, if there is a vertex $j \in
\mathcal{N}_i^{G_k}$, then this vertex labels an agent which has not gossiped
or virtually gossiped with agent i within k iterations. Protocol
III stipulates that each label in $\mathcal{N}_i^{G_k}$ is closer to the front of
$q_i(t+k)$ than all the labels in $\mathcal{N}_i - \mathcal{N}_i^{G_k}$. Since i is arbitrary,
this is true for all $i \in \{1, 2, \ldots, n\}$. It follows from Lemma
2 that there is an edge (a, b) in $\mathcal{E}_0 - \mathcal{E}_k$ such that at time
t + k, neighboring agents a and b either gossip or virtually
gossip.

By hypothesis $\mathcal{E}_k \neq \mathcal{E}_0$. Since $\mathcal{E}_k \supset \mathcal{E}_j$ for $j \in
\{1, 2, \ldots, k-1\}$, it must be true that $\mathcal{E}_j \neq \mathcal{E}_0$ for $j \in
\{1, 2, \ldots, k\}$. Thus the preceding argument applies for all
$j \in \{1, 2, \ldots, k\}$, so for each such j there must be an edge
$(a_j, b_j)$ in $\mathcal{E}_0 - \mathcal{E}_j$ such that at time t + j, neighboring
agents $a_j$ and $b_j$ either gossip or virtually gossip. Clearly
$(a_j, b_j) \notin \{(a_1, b_1), (a_2, b_2), \ldots, (a_{j-1}, b_{j-1})\}$ for $j \in
\{2, 3, \ldots, k\}$ because $\{(a_1, b_1), (a_2, b_2), \ldots, (a_{j-1}, b_{j-1})\}
\subset \mathcal{E}_j$ and $(a_j, b_j) \in \mathcal{E}_0 - \mathcal{E}_j$. It follows that
$(a_1, b_1), (a_2, b_2), \ldots, (a_k, b_k)$ are distinct edges in A.


The preceding argument implies that at the end of k
iterations, either $\mathcal{E}_k = \mathcal{E}_0$ or k distinct gossips/virtual gossips
have taken place. If the former is true, then each agent
will have gossiped or virtually gossiped at least once with
each of its neighbors within k and therefore m iterations.
If the latter is true, then k must be less than m and k
distinct gossips/virtual gossips will have taken place within
k iterations. Clearly this process can be continued until for
some integer $k \le m$, $\mathcal{E}_k = \mathcal{E}_0$ in which case each agent will
have gossiped or virtually gossiped at least once with each
of its neighbors within k and therefore m iterations. ■

IV.  Concluding  Remarks 

One  of  the  problems  with  the  idea  of  gossiping,  which 
apparently  is  not  widely  appreciated,  is  that  it  is  difficult 
to  devise  provably  correct  gossiping  protocols  which  are 
guaranteed  to  avoid  deadlocks  without  making  restrictive 
assumptions.  The  research  in  this  paper  and  in  [11]  and  [16] 
contributes  to  our  understanding  of  this  issue  and  how  to 
deal with it.

ABSTRACT | For the purposes of this paper, “gossiping” is a
distributed process whose purpose is to enable the members of
a group of autonomous agents to asymptotically determine, in
a decentralized manner, the average of the initial values of
their scalar gossip variables. This paper discusses several
different deterministic protocols for gossiping which avoid
deadlocks and achieve consensus under different assumptions. First
considered is T-periodic gossiping which is a gossiping
protocol which stipulates that each agent must gossip with the
same neighbor exactly once every T time units. Among the
results discussed is the fact that if the underlying graph
characterizing neighbor relations is a tree, convergence is
exponential at a worst case rate which is the same for all possible
T-periodic gossip sequences associated with the graph. Many
gossiping protocols are request based which means simply
that a gossip between two agents will occur whenever one of
the two agents accepts a request to gossip placed by the other.
Three deterministic request-based protocols are discussed.
Each is guaranteed to not deadlock and to always generate
sequences of gossip vectors which converge exponentially fast.
It is shown that worst case convergence rates can be
characterized in terms of the second largest singular values of
suitably defined doubly stochastic matrices.

Manuscript received May 20, 2010; revised December 6, 2010; accepted May 25, 2011.
Date of publication August 4, 2011; date of current version August 19, 2011. The work
of J. Liu, S. Mou, and A. S. Morse was supported by the U.S. Army Research Office,
the U.S. Air Force Office of Scientific Research, and the National Science Foundation.
The work of B. D. O. Anderson was supported by the Australian Research Council's
Discovery Project DP-110100538 and the National ICT Australia (NICTA). NICTA is
funded by the Australian Government as represented by the Department of
Broadband, Communications and the Digital Economy and the Australian Research
Council through the ICT Centre of Excellence program. The work of C. B. Yu was
supported by the Australian Research Council through a Queen Elizabeth II Fellowship
and DP-110100538 and by the Overseas Expert Program of Shandong Province, China.
The work of B. D. O. Anderson and C. B. Yu was also supported by the U.S. Air Force
Research Laboratory Grant FA2386-10-1-4102.

J. Liu, S. Mou, and A. S. Morse are with Yale University, New Haven, CT 06520 USA
(e-mail: ji.liu@yale.edu; shaoshuaimou@yale.edu; as.morse@yale.edu).

B. D. O. Anderson is with the Australian National University, Canberra, A.C.T. 0200,
Australia and also with the National ICT Australia (NICTA), Canberra Research
Laboratory, Canberra, A.C.T. 2601, Australia.

C. Yu is with the Australian National University, Canberra, A.C.T. 0200, Australia, the
National ICT Australia (NICTA), Canberra Research Laboratory, Canberra, A.C.T. 2601,
Australia, and also Shandong Computer Center, Jinan, China.

Digital Object Identifier: 10.1109/JPROC.2011.2159689

KEYWORDS | Consensus; distributed averaging; nonhomogeneous
Markov chains; stochastic matrices

I.  INTRODUCTION 

There has been considerable interest recently in developing
algorithms for distributing information among the
members of a group of sensors or mobile autonomous
agents via local interactions. Notable among these are
those algorithms intended to cause such a group to reach a
consensus in a distributed manner [1]-[7]. One particular
type of consensus process which has received much
attention lately is called distributed averaging [8]. In its
simplest form, distributed averaging deals with a network
of n > 1 agents and the constraint that each agent i is able
to communicate only with certain other agents called
agent i’s neighbors. Neighbor relations are described by a
simple, connected graph N in which vertices correspond
to agents and edges indicate neighbor relations. Thus, the
neighbors of an agent i have the same labels as the vertices
in N which are adjacent to vertex i. Initially, each agent i
has or acquires a real number $y_i$ which might be a
measured temperature or something similar. The distributed
averaging problem is to devise a protocol which will
enable each agent to compute the average $y_{\mathrm{avg}} =
(1/n) \sum_{i=1}^{n} y_i$ using only information acquired from its
neighbors. There are many variants of this problem. For
example, instead of real numbers, the $y_i$ may be
integer-valued [9]. Another variant assumes that the edges of N
change over time [10]. This paper considers the case when
the $y_i$ are real numbers and N does not depend on time.

As noted in [8], the distributed averaging problem can
be solved, in principle, by “flooding”; that is, by
propagating across the network over time the values of all of the
$y_i$. Armed with knowledge of all of these values, each agent
is thus able to compute $y_{\mathrm{avg}}$. A more sophisticated
approach to the problem is for each agent to use a linear
iterative update rule of the general form

$$x_i(t+1) = w_{ii} x_i(t) + \sum_{j \in \mathcal{N}_i} w_{ij} x_j(t), \quad x_i(1) = y_i$$

where t is a discrete time index, $x_i(t)$ is agent i’s current
estimate of $y_{\mathrm{avg}}$, the $w_{ij}$ are real-valued weights, and $\mathcal{N}_i$ is
the set of labels of the neighbors of agent i. In [8], several
methods are proposed for choosing the $w_{ij}$. One particular
choice, which defines what has come to be known as the
Metropolis algorithm, requires only local information to
define the $w_{ij}$. Algorithms of this type, which require each
agent to communicate with all of its neighbors on each
iteration, are sometimes called broadcast algorithms.
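
To illustrate the linear update rule, the sketch below uses the commonly stated form of the Metropolis weights, $w_{ij} = 1/(1 + \max\{d_i, d_j\})$ for neighbors i and j, with $w_{ii}$ chosen so that each row sums to one, where $d_i$ is the number of neighbors of agent i. This formula is assumed here and is not spelled out in the present text; the function names, graph and initial values are likewise hypothetical.

```python
def metropolis_weights(adjacency):
    """Weights that each agent can compute from its own degree and the
    degrees of its neighbors (assumed Metropolis form)."""
    deg = {i: len(nb) for i, nb in adjacency.items()}
    W = {}
    for i, nb in adjacency.items():
        W[i] = {j: 1.0 / (1 + max(deg[i], deg[j])) for j in nb}
        W[i][i] = 1.0 - sum(W[i].values())   # self-weight makes the row sum to 1
    return W

def iterate(W, x):
    """One broadcast iteration x_i(t+1) = w_ii x_i(t) + sum_j w_ij x_j(t)."""
    return {i: sum(w * x[j] for j, w in W[i].items()) for i in W}

# Path graph 0 - 1 - 2 - 3 with assumed initial readings whose average is 4.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = {0: 0.0, 1: 2.0, 2: 4.0, 3: 10.0}
W = metropolis_weights(adjacency)
for t in range(500):
    x = iterate(W, x)
```

Every agent uses all of its neighbors' values on every iteration, which is the broadcasting behavior that the gossiping protocols aim to avoid.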

An alternative approach to distributed averaging,
which typically does not involve broadcasting, exploits a
form of “gossiping” [11] specifically tailored to the
distributed averaging problem. The idea of gossiping is very
simple. A pair of neighbors with labels i and j are said to
gossip at time t if both $x_i(t+1)$ and $x_j(t+1)$ are set equal
to the average of $x_i(t)$ and $x_j(t)$. Each agent is allowed to
gossip with at most one neighbor at one time. Under
appropriate assumptions, algorithms which possess this
simple property can be shown to solve the distributed
averaging problem. Gossiping algorithms do not necessarily
involve broadcasting and thus have the potential to
require fewer transmissions per iteration than broadcast
algorithms. Of course, one would not expect gossip
algorithms to converge as fast as broadcast algorithms.

Gossiping  might  find  application  in  many  contexts.  For 
example,  suppose  that  a  spatially  distributed  network  of 
temperature  sensors  has  been  deployed  in  such  a  way
that  each  sensor  can  communicate  with  nearby  sensors 
(i.e.,  neighbors).  Suppose  that  at  some  specific  time  all 
sensors  take  readings  and  use  these  readings  as  initial 
estimates  of  the  average  temperature  across  the  network. 
At  subsequent  clock  times,  each  sensor  then  passes  its 
current  estimate  to  one  of  its  neighbors  who  in  turn  uses 
this  estimate  to  update  its  own  estimate  with  the  goal  of 
ultimately  arriving  at  the  average  value  of  the  temperature 
across  the  network.  Gossiping  is  a  process  for  recursively 
carrying  out  these  computations. 

Implementation  of  any  gossiping  protocol  necessarily 
involves  some  degree  of  centralization.  In  particular,  for 
the  aforementioned  sensor  network  averaging  task  to  make 
sense  each  sensor  must  be  instructed  by  a  centralized 
manager  to  take  a  temperature  reading  at  the  same  time 
t  =  ti  as  the  rest.  In  some  cases,  it  may  be  useful  to  take 
centralization  further.  For  example,  to  facilitate  gossiping 
it  may  be  helpful  in  some  instances  to  centrally  define  a 
network-wide  sensor  ordering  by  assigning  offline  to  each 
sensor  a  unique  priority  number  with  the  understanding 
that  each  sensor  can  make  use  of  the  priority  numbers  of
its  neighbors  to  carry  out  its  role  in  the  gossiping  process
[12].  Another  idea  requiring  some  degree  of  centralization
is  to  assign  offline  to  each  agent  a  specific  sequence  of
times  at  which  the  agent  may  gossip  (Section  IV-A).  One
might  carry  centralization  one  step  further  by  specifying 
offline  for  each  agent  which  neighbor  the  agent  is  to  gossip 
with  at  each  clock  time  (Section  III).  This  of  course  could 
be  done  only  in  a  network  whose  population  of  sensors 
does  not  change  over  time  in  a  predictable  manner.  For 
networks  whose  members  change  with  time,  one  might 
consider  other  ideas  and  assumptions.  For  example,  for  the 
temperature  sensing  network,  one  might  try  implementing 
a  gossiping  protocol  which  assumes  that  at  each  clock  time 
each  sensor  can  acquire  the  current  temperature  estimates 
of  all  of  its  neighbors  [13].  A  refinement  of  this  protocol 
which  requires  each  agent  to  acquire  the  current  temperature
of  only  a  subset  of  its  neighbors  at  each  clock  time
is  discussed  in  Section  IV-A.  In  the  end,  which  assumptions
make  sense  depends  on  the  specific  application.

Although gossiping is a form of consensus, it differs
from consensus in several important ways. First, the goal
of consensus is to agree on the value of some quantity
whereas the goal of gossiping is to compute the average of
the initial values of the $x_i$, henceforth called gossip variables.
Second, unlike consensus processes, gossiping processes
are invariably designed so that the sum total of all
gossip variables remains constant from clock time to clock
time. This has a simple but important consequence: If the
sum total of all gossip variables remains constant and a
consensus is reached in that all gossip variables converge
to the same value, then this value must be the average of
the initial values of all gossip variables in the network. In
gossiping, the sum total of all gossip variables is kept constant
by requiring agents to always gossip in pairs using
averaging. Although this simple idea keeps constant the sum
of gossip variables across the network, the idea comes with a
price in that a deadlock may well occur unless specific
precautions are built into the protocol to preclude this.

The specific sequence of gossips which occurs during a
given gossiping process might be determined either
probabilistically [11], [14] or deterministically [12], [15],
depending on the problem of interest. Deterministic gossiping
protocols are intended to guarantee that under all
conditions, a consensus will be achieved asymptotically
whereas probabilistic protocols aim at achieving consensus
asymptotically with probability one. Both approaches have
merit. The probabilistic approach is typically somewhat
easier both in terms of algorithm development and convergence
analysis. On the other hand, the deterministic approach
forces one to consider worst case scenarios and has
the potential of yielding algorithms which may outperform
those obtained using the probabilistic approach. This paper
takes the deterministic approach.

Of particular interest is the rate at which a sequence of
agent gossip variables converges to a common value. The
convergence rate question for more general deterministic
consensus problems has been studied in [16] and [17]. In
[11] and [14], the convergence rate question is addressed
for gossiping algorithms in which the sequence of gossip
pairs under consideration is determined probabilistically.
A modified gossiping algorithm intended to speed up convergence
is proposed in [18] without proof of correctness,
but with convincing experimental results. The algorithm
has recently been analyzed in [19]. Recent results concerning
convergence rates appear in [15], [20], and [21] for
periodic gossiping and in [22]-[24] for deterministic aperiodic
gossiping. This paper presents a more comprehensive
treatment of the work in [15] and [24].

A typical gossiping process can usually be modeled
as a discrete time linear system of the form
$x(t+1) = M(t)x(t)$, $t = 1, 2, \ldots$, where x is a vector of agent
gossip variables $x_i$ and each value of $M(t)$ is a specially structured
doubly stochastic matrix (Section II). Roughly speaking, a
finite sequence of gossip pairs is “complete” if the corresponding
set of edges in $\mathbb{N}$ forms a connected spanning
subgraph. A complete gossip sequence is minimally complete
if there is no other complete gossip sequence of
shorter length (Section II-B). An infinite sequence of
gossips is “repetitively complete” with period T if each
successive subsequence of gossips of length T in the
sequence is complete. The gossip variable sequences associated
with repetitively complete gossip sequences converge
exponentially fast (Section II-C). Repetitively
complete gossip sequences which are also periodic with
period T are treated in Section III. The worst case
convergence rate of any such sequence is determined by T
and by the second largest eigenvalue (in magnitude) of the
stochastic matrix which the gossips define over a period. In
the case when $\mathbb{N}$ is a tree and the sequence of gossips over a
period is minimally complete, the value of this eigenvalue
does not depend on the order in which gossips over a period
take place. A proof of this surprising result is given in [25].

Most gossiping protocols are “request-based.” By
request-based gossiping is meant a gossiping process in
which a gossip occurs between two agents whenever one of
the two accepts a request to gossip placed by the other
(Section IV). An agent’s “event times” are the times at
which the agent makes requests to gossip. In Section IV-A,
a request-based protocol is given which generates repetitively
complete (and thus exponentially convergent) gossip
sequences under the assumption that the event times of
each agent are different from the event times of all of its
neighbors. A more refined repetitively complete gossiping
protocol not requiring this assumption is discussed in
Section IV-A. The protocol is inspired by ideas put forth
in [13].

It is shown in Section IV-C that the worst case
convergence rate of a repetitively complete gossip
sequence with period T can be characterized in terms of
a suitably defined seminorm of the stochastic matrix S
determined by the subsequence of gossips occurring over a
given period. A specific goal of this paper is to find a
seminorm with respect to which S is a contraction. The
role played by seminorms in characterizing convergence
rate is explained in Section IV-C. Three different types of
seminorms are considered in Section IV-C. Each is
compared to the well-known coefficient of ergodicity
which plays a central role in the study of convergence rates
for nonhomogeneous Markov chains [26]. Somewhat surprisingly,
it turns out that a particular Euclidean seminorm
on $\mathbb{R}^{n \times n}$ has the required property, namely that in
this seminorm, the stochastic matrix S determined by any
complete gossip sequence is a contraction (Section IV-D).
This particular seminorm turns out to be the second largest
singular value of S.

II.  GOSSIPING 

Gossiping, as used here, is a form of distributed computation
whose purpose is to calculate the average value of a
set of numbers or measurements existing at the “nodes” of
a distributed network. The type of gossiping we want to
consider involves a group of n > 1 agents labeled 1 to n.
Each agent i has control over a real-valued scalar quantity $x_i$
called a gossip variable which the agent is able to update
from time to time. A gossip between agents i and j, written
(i, j), occurs at time $t \in \{1, 2, \ldots\}$ if the values of both
agents’ variables at time t + 1 equal the average of their
values at time t. In other words,
$x_i(t+1) = x_j(t+1) = (1/2)(x_i(t) + x_j(t))$. If agent i does not
gossip at time t, its gossip variable does not change; thus in
this case $x_i(t+1) = x_i(t)$. Agents may only gossip with their
neighbors. Agent j is a neighbor of agent i if (i, j) is an edge in a
given simple, undirected, n-vertex graph $\mathbb{N}$ called a neighbor
graph; we use the symbol $\mathcal{N}_i$ to denote the set of labels
of the neighbors of agent i. One rule which sharply distinguishes
a gossiping process from a more general consensus
process is that in the case of gossiping, each agent is
allowed to gossip with at most one of its neighbors at one
time. This rule does not preclude the possibility of two or
more pairs of agents gossiping at the same time, provided
no two of the pairs have an agent in common. More
precisely, two gossip pairs (i, j) and (k, m) are noninteracting
if neither i nor j equals either k or m. When multiple
noninteracting pairs of gossips occur simultaneously, the
simultaneous occurrence of all such gossips is called a
multigossip. In other words, a multigossip at time t is the set
of all gossips which occur at time t with the understanding
that any two such pairs are noninteracting. A central goal of
gossiping is for the n agents to reach a consensus in the
sense that all n gossip variables ultimately reach the same
value in the limit as $t \to \infty$. For this to be possible, no
matter what the initial values of the gossip variables are, it
is clearly necessary that $\mathbb{N}$ be a connected graph. We
assume that this is so.
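The gossip and multigossip rules just described can be illustrated with a minimal Python sketch. The function name `multigossip`, the dictionary representation of the gossip variables, and the four-agent path graph are assumptions made for illustration only; they are not taken from the paper.

```python
def multigossip(x, pairs, neighbors):
    """Apply one multigossip: a set of noninteracting gossips at one time step."""
    busy = set()
    for i, j in pairs:
        if j not in neighbors[i]:
            raise ValueError(f"agents {i} and {j} are not neighbors")
        if i in busy or j in busy:
            # An agent may gossip with at most one neighbor at one time.
            raise ValueError("gossip pairs must be noninteracting")
        busy.update((i, j))
    new_x = dict(x)  # agents that do not gossip keep their values
    for i, j in pairs:
        new_x[i] = new_x[j] = (x[i] + x[j]) / 2
    return new_x


# Hypothetical path graph 1-2-3-4 and made-up gossip variables.
neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
x1 = {1: 4.0, 2: 8.0, 3: 0.0, 4: 2.0}
x2 = multigossip(x1, [(1, 2), (3, 4)], neighbors)
print(x2)  # {1: 6.0, 2: 6.0, 3: 1.0, 4: 1.0}
```

Because each gossip replaces two values by their average, the sum of the gossip variables is unchanged by the step (here 14 before and after).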

A gossiping process can be modeled by a discrete time
linear system of the form $x(t+1) = M(t)x(t)$, $t = 1, 2, \ldots$,
where $x \in \mathbb{R}^n$ is a state vector of gossip variables and $M(t)$
is a matrix characterizing how x changes as the result of the
gossips which take place at time t. If a single pair of distinct
agents i and j gossip at time $t \geq 1$, then $M(t) = P_{ij}$ where
$P_{ij}$ is the n × n matrix for which
$p_{ii} = p_{ij} = p_{ji} = p_{jj} = 1/2$, $p_{kk} = 1$ for
$k \notin \{i, j\}$, and all remaining entries equal zero. We
call such $P_{ij}$ single-gossip matrices. It will be convenient to
include in the set of single-gossip matrices the n × n
identity matrix; the identity matrix can be thought of as the
correct update matrix to model the situation when no
gossips occur at time t. If at time t a multigossip occurs,
then as a consequence of noninteraction, $M(t)$ is simply
the product of the single-gossip matrices corresponding to
the individual gossips comprising the multigossip; moreover,
because of noninteraction, the single-gossip matrices
in the product commute with each other and so any given
permutation of the single-gossip matrices in the product
determines the same matrix P. We refer to P as the gossip
matrix determined by the multigossip under consideration.
Thus, for example, if at time t agents a, b, and c gossip with
agents d, e, and f, respectively, then the gossip matrix
determined is $P = P_{ad}P_{be}P_{cf}$. Thus, in this case,
$M(t) = P$ and $x(t+1) = M(t)x(t)$.
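The commutation claim can be checked numerically on a small case. The sketch below builds single-gossip matrices with exact rational entries; the helper names `single_gossip_matrix` and `matmul`, and the choice of n = 4, are illustrative assumptions rather than anything prescribed by the text.

```python
from fractions import Fraction


def single_gossip_matrix(n, i, j):
    """Return P_ij for agents labeled 1..n as a list of rows of Fractions."""
    P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    a, b = i - 1, j - 1  # convert agent labels to 0-based indices
    P[a][a] = P[a][b] = P[b][a] = P[b][b] = Fraction(1, 2)
    return P


def matmul(A, B):
    """Plain matrix product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


P12, P34, P23 = (single_gossip_matrix(4, *e) for e in [(1, 2), (3, 4), (2, 3)])
commute = matmul(P12, P34) == matmul(P34, P12)         # noninteracting pairs
do_not_commute = matmul(P12, P23) != matmul(P23, P12)  # pairs sharing agent 2
print(commute, do_not_commute)  # True True
```

The second comparison shows why noninteraction matters: once two gossips share an agent, the order in which they are applied changes the resulting matrix.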

A.  Doubly  Stochastic  Matrices 

Each single-gossip matrix is a nonnegative matrix
whose row sums and column sums all equal one. Matrices
with these two properties are called doubly stochastic. Note
that the type of doubly stochastic matrices which characterize
single gossips (i.e., single-gossip matrices) has two
additional properties: it is symmetric and its diagonal
entries are all positive. The same is true for the type of
doubly stochastic matrices which characterize multigossips.
Doubly stochastic matrices are special types of
“stochastic matrices” where by a stochastic matrix is meant
a nonnegative n × n matrix whose row sums all equal one.
It is easy to see that a nonnegative matrix S is stochastic if
and only if $S\mathbf{1} = \mathbf{1}$ where $\mathbf{1} \in \mathbb{R}^n$
is a column vector whose entries are all ones. Similarly, a
nonnegative matrix S is doubly stochastic if and only if
$S\mathbf{1} = \mathbf{1}$ and $S'\mathbf{1} = \mathbf{1}$, where a prime
denotes transpose. Using these characterizations it is easy to
prove that the class of stochastic matrices in
$\mathbb{R}^{n \times n}$ is compact and closed under
multiplication as is the class of doubly stochastic matrices
in $\mathbb{R}^{n \times n}$. It is also true that the class of nonnegative
matrices in $\mathbb{R}^{n \times n}$ with positive diagonal entries is closed
under multiplication.

Mathematically, reaching a consensus by means of an
infinite sequence of gossips or multigossips modeled by a
corresponding infinite sequence of gossip matrices M(1),
M(2), M(3), … means that the sequence of matrix products
M(1), M(2)M(1), M(3)M(2)M(1), … converges to a
matrix of the form $\mathbf{1}c$. It turns out that if convergence
occurs, the limit matrix $\mathbf{1}c$ is also a doubly stochastic
matrix; this means that $c = (1/n)\mathbf{1}'$ and consequently that all
n gossip variables will have converged to the average of
their initial values. This particular fact further distinguishes
a gossiping process from a more general consensus
process, since in a consensus process the value to which all
consensus variables typically converge is not necessarily
the average of their initial values.
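The step from double stochasticity of the limit to the form of c can be spelled out in one line. The short derivation below uses only the column-sum property $\mathbf{1}'S = \mathbf{1}'$ of a doubly stochastic matrix S, applied to the limit matrix $\mathbf{1}c$ with c a row vector:

```latex
\mathbf{1}'(\mathbf{1}c) = \mathbf{1}'
\;\Longrightarrow\; n\,c = \mathbf{1}'
\;\Longrightarrow\; c = \tfrac{1}{n}\mathbf{1}',
\qquad\text{so}\qquad
\lim_{t\to\infty} x(t) = \mathbf{1}c\,x(1)
  = \Bigl(\tfrac{1}{n}\sum_{i=1}^{n} x_i(1)\Bigr)\mathbf{1}.
```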

B.  Gossiping  Sequences 

By a gossiping sequence is meant a sequence of individual
gossips corresponding to some or all of the edges in a given
neighbor graph $\mathbb{N}$. Corresponding to any given sequence of
gossips $(i_1, j_1), (i_2, j_2), \ldots$ is a sequence of single-gossip
matrices $P_{i_1 j_1}, P_{i_2 j_2}, \ldots$ whose product
$\cdots P_{i_2 j_2}P_{i_1 j_1}$ defines the
mapping which assigns to any given initial vector of gossip
variables, the vector of gossip variables which results from
the gossips in the sequence. We call any such matrix product
a gossip matrix. It is thus clear that a given neighbor
graph has associated with it a family of gossip matrices
whose members are all products of all combinations of
single-gossip matrices of all lengths. These are the gossip
matrices determined by $\mathbb{N}$. Conversely, any given sequence
of individual gossips (or corresponding product of single-gossip
matrices) induces a spanning subgraph of $\mathbb{N}$ whose
edges correspond to the gossips in the sequence. We say
that a gossip sequence or corresponding gossip matrix is
complete if the graph the gossips in the sequence induce is a
connected spanning subgraph within $\mathbb{N}$. A gossip sequence
and the corresponding gossip matrix are minimally complete if
the sequence is complete and if there is no other complete gossip
sequence of shorter length. It is easy to see that a
nonredundant¹ gossip sequence is minimally complete if and only
if the subgraph of $\mathbb{N}$ that it induces is a minimal spanning
tree in $\mathbb{N}$. In the special but important case when $\mathbb{N}$ is itself a
tree $\mathbb{T}$, more can be said. In this case, a minimally complete
gossip sequence is one in which, for each edge in $\mathbb{T}$, there is
exactly one corresponding individual gossip in the sequence.
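Completeness of a finite gossip sequence is a graph connectivity question, so it can be tested with a standard union-find pass over the gossips. The sketch below is illustrative only: the function names are invented here, the gossips are assumed to already be edges of the neighbor graph, and the minimal-completeness test relies on the observation that a connected spanning subgraph on n vertices needs at least n − 1 edges, so a complete sequence of length n − 1 cannot be shortened.

```python
def is_complete(n, gossips):
    """True if the gossips, viewed as edges on agents 1..n, connect all n agents."""
    parent = list(range(n + 1))  # index 0 unused; agents are labeled 1..n

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for i, j in gossips:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    return components == 1


def is_minimally_complete(n, gossips):
    """Complete, and as short as any complete sequence can be (n - 1 gossips)."""
    return is_complete(n, gossips) and len(gossips) == n - 1


print(is_complete(4, [(1, 2), (2, 3), (3, 4)]))            # True
print(is_complete(4, [(1, 2), (3, 4)]))                    # False
print(is_minimally_complete(4, [(1, 2), (2, 3), (3, 4)]))  # True
```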

The preceding ideas extend in a natural way to sequences
of multigossips. Corresponding to any given sequence
of multigossips $\gamma_1, \gamma_2, \ldots$ is a sequence of gossip
matrices $Q_1, Q_2, \ldots$ where $Q_i$ is the product of the single-gossip
matrices of the individual gossips in the ith multigossip
in the sequence. The product $\cdots Q_2 Q_1$ thus defines
the mapping which assigns to any given initial vector of
gossip variables the vector of gossip variables which results
from the multigossips in the sequence. Clearly any such
matrix product is also a product of single-gossip matrices
and thus is a bona fide gossip matrix.

Extending the concept of nonredundancy, we say that a
multigossip sequence is nonredundant if no individual
gossip occurs in more than one multigossip in the sequence.
Nonredundant multigossip sequences are clearly
of finite length in that the length of each is no larger than
the number of edges of $\mathbb{N}$. The graph induced by a multigossip
sequence $\Sigma$, written $\mathbb{N}_\Sigma$, is the spanning subgraph
of $\mathbb{N}$ whose edges correspond to all of the gossips in all of
the multigossips in the sequence. A multigossip sequence
is complete if the graph $\mathbb{N}_\Sigma$ which it induces is a connected
spanning subgraph of $\mathbb{N}$. $\Sigma$ is minimally complete if it is
complete and if the sum total of all the single gossips in all
the multigossips in the sequence is no larger than the sum
total of all the single gossips in all the multigossips in any
other complete multigossip sequence. It is clear that if $\Sigma$ is
minimally complete, then the subgraph $\mathbb{N}_\Sigma$ it induces
must be a minimal spanning tree. On the other hand, if $\Sigma$
is nonredundant and $\mathbb{N}_\Sigma$ is a minimal spanning tree of $\mathbb{N}$,
then $\Sigma$ must be minimally complete.

¹A gossip sequence is nonredundant if each gossip in the sequence
occurs at most once.

C.  Convergence 

Roughly speaking, if over a period T a complete multigossip
sequence has occurred, then each agent in the group
will have been “in touch” with each other agent at least
indirectly. It is not surprising then that complete multigossip
subsequences over successive periods in an infinitely
long sequence should be sufficient for all gossip variables in
a gossiping process to converge to a common value.
Prompted by this, let us call an infinite sequence of multigossips
repetitively complete with period T if each successive
subsequence of multigossips of length T in the sequence is
complete. The following theorem implies that the gossip variables
of repetitively complete multigossip sequences converge
exponentially fast.

Theorem 1: Let M(1), M(2), M(3), … denote the
gossiping matrices corresponding to an infinite sequence
of multigossips which is repetitively complete with
period T. Suppose that the vector of gossip variables x(t)
evolves according to $x(t+1) = M(t)x(t)$, $t \geq 1$. There
exists a real nonnegative number $\lambda < 1$ such that for each
initial value of x(1), all n gossip variables converge to the
average value

$$y_{\mathrm{avg}} = \frac{1}{n}\sum_{i=1}^{n} x_i(1)$$

as fast as $\lambda^t$ converges to zero.

There  are  several  different  ways  to  prove  this  theorem; 
see,  for  example,  [3],  [6],  [7],  and  [23].  It  turns  out  that 
this  theorem  is  a  simple  consequence  of  more  technical 
results  which  are  of  interest  in  their  own  right  and  which 
will  be  stated  and  proved  in  Section  IV. 

The  theorem  also  applies  to  more  general  gossiping 
algorithms  in  which  simple  averaging  between  agent  pairs 
is  replaced  with  averaging  based  on  convex  combinations. 
All  that  is  required  is  that  the  averaging  rule  determines 
doubly  stochastic  matrices  M(·)  which  take  values  in  a
compact  set. 
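The behavior described by Theorem 1 can be observed in a small simulation. The sketch below is an illustration under stated assumptions, not the protocol of the paper: the neighbor graph is a made-up path 1-2-3-4, each gossip is treated as its own time step, and in every period of length T = 3 all three edges are used once in a freshly shuffled order, which makes the sequence repetitively complete but generally not periodic.

```python
import random


def gossip(x, i, j):
    """Single gossip: both agents take the average of their current values."""
    x[i] = x[j] = (x[i] + x[j]) / 2


def run(x, edges, periods, seed=0):
    """Each period uses every edge once, in a freshly shuffled order."""
    rng = random.Random(seed)
    x = dict(x)
    for _ in range(periods):
        order = edges[:]
        rng.shuffle(order)
        for i, j in order:
            gossip(x, i, j)
    return x


edges = [(1, 2), (2, 3), (3, 4)]            # hypothetical path graph 1-2-3-4
x1 = {1: 10.0, 2: 20.0, 3: 30.0, 4: 60.0}   # made-up initial values
avg = sum(x1.values()) / len(x1)
# Largest deviation from the average after 0, 5, 10, and 20 periods.
errors = [max(abs(v - avg) for v in run(x1, edges, p).values())
          for p in (0, 5, 10, 20)]
print(errors)
```

The printed deviations shrink from period to period while the sum of the gossip variables stays fixed, which is the qualitative behavior the theorem guarantees.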

D.  Tree  Graphs 

In graph theory, tree graphs (i.e., graphs without
cycles) often lead to significant simplifications. This is also
the case with gossiping. Let us note that a tree graph has
the property that removal of any one edge (i, j) results in a
disconnected graph.² This means that if $\mathbb{N}$ is a tree, a
necessary condition for a finite sequence of gossips to be
complete is that over the period during which the gossips
occur, each agent must gossip with each of its neighbors at
least once. It is clear that the converse is also true; i.e., if
each agent gossips with each of its neighbors at least once
during a given period, then the sequence of gossips which
took place over that period must be complete. It is easy to
see that in the case when $\mathbb{N}$ is a tree, a gossip sequence is
minimally complete if and only if a gossip between each
agent and each of its neighbors occurs exactly once in the
sequence. Equivalently, a gossip matrix G for a graph $\mathbb{N}$
which is a tree is minimally complete if and only if G is
a product of all of the single-gossip matrices associated
with $\mathbb{N}$, each taken exactly once.


III.  PERIODIC  GOSSIPING 

About the easiest way to guarantee a repetitively complete
multigossip sequence is to use a protocol which generates
an infinite multigossip sequence which on the one hand is
“periodic” and on the other is complete on each successive
period. Prompted by this, let us agree to call an infinite
sequence of multigossips periodic with period T if each
multigossip in the sequence occurs once every T time
units; such a sequence is periodically complete if each
subsequence consisting of T consecutive multigossips is
complete. It is clear that any periodically complete multigossip
sequence is repetitively complete. The converse of course
is not true.

A.  Convergence  Rate 

Corresponding to any T-periodic sequence of multigossips
is an infinite sequence of gossip matrices; such a
matrix sequence is periodic with period T in that each matrix
within the sequence repeats itself every T time units.
Suppose that M(1), M(2), … is such a T-periodic sequence.
If $x(t+1) = M(t)x(t)$, $t \geq 1$, it is clear that
$x((i+1)T+1) = Nx(iT+1)$, $i \geq 0$, where
$N = M(T)M(T-1)\cdots M(1)$. Thus, $x(iT+1) = N^i x(1)$, $i \geq 0$,
which means that both the convergence and convergence rate of
the gossip variables in a periodic multigossip sequence are
completely determined by properties of N. Note that N is a doubly
stochastic matrix because each of the matrices in the product
defining it is doubly stochastic, and because the class
of n × n doubly stochastic matrices is closed under
multiplication. Now because N is stochastic, it has an eigenvalue
at 1 and its spectral radius is 1 [27]. We are interested in the
case when $\lim_{i\to\infty} N^i = (1/n)\mathbf{1}\mathbf{1}'$, which is
clearly just when all eigenvalues other than the one eigenvalue with
value 1 have magnitudes strictly less than one. This is
precisely the property of a complete gossip matrix.

²If this were not so then there would have to be at least two distinct
paths between i and j, which contradicts the requirement that a tree be
acyclic.

Theorem  2:  A  gossip  matrix  is  complete  if  and  only  if  the 
magnitudes  of  all  of  its  eigenvalues,  with  the  exception  of  a 
single  eigenvalue  of  value  1,  are  strictly  less  than  1. 

A  proof  of  this  theorem  will  be  given  in  Section  IV-D. 
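As a small worked illustration of Theorem 2, consider three agents on a path graph where agents 1 and 2 gossip first and agents 2 and 3 gossip second. This example, including the graph and the gossip order, is chosen here for illustration and is not taken from the source. The gossip matrix over one period of length T = 2 and its characteristic polynomial are:

```latex
G = P_{23}P_{12}
  = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac12 & \tfrac12 \\ 0 & \tfrac12 & \tfrac12 \end{bmatrix}
    \begin{bmatrix} \tfrac12 & \tfrac12 & 0 \\ \tfrac12 & \tfrac12 & 0 \\ 0 & 0 & 1 \end{bmatrix}
  = \begin{bmatrix} \tfrac12 & \tfrac12 & 0 \\ \tfrac14 & \tfrac14 & \tfrac12 \\ \tfrac14 & \tfrac14 & \tfrac12 \end{bmatrix},
\qquad
\det(\lambda I - G) = \lambda(\lambda - 1)\bigl(\lambda - \tfrac14\bigr).
```

The eigenvalues are 1, 1/4, and 0, so every eigenvalue other than the single eigenvalue at 1 has magnitude strictly less than 1, as the theorem requires of a complete gossip matrix. Since the error shrinks by a factor of 1/4 over each period of two time steps, the average decay per time step is $(1/4)^{1/2} = 1/2$. By contrast, the incomplete sequence consisting of the gossip (1, 2) alone gives $P_{12}$, whose eigenvalues are 1, 1, and 0.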

For any doubly stochastic matrix S, let $\rho(S)$ denote the
magnitude of the second largest eigenvalue (in magnitude)
of S. It is clear that the rate at which x converges to
$\mathbf{1}y_{\mathrm{avg}}$ is $\rho^{1/T}(N)$. In the case when the
neighbor graph $\mathbb{N}$ is a tree, $\rho(N)$ turns out to be
the same for all minimally complete gossip matrices N
determined by $\mathbb{N}$. This somewhat surprising fact is a direct
consequence of the following theorem which is the main
result of [25].

Theorem 3: Let $\mathcal{E} = e_1, e_2, \ldots, e_k$ be the sequence of
edges of $\mathbb{N}$ labeling a nonredundant, complete gossip sequence.
Let $\mathbb{N}_{\mathcal{E}}$ denote the spanning subgraph of $\mathbb{N}$ whose
edges are the edges in $\mathcal{E}$. Let $\mathcal{G}(\mathcal{E})$ be the group (under
composition) consisting of the identity map on $\{1, 2, \ldots, k\}$
together with all maps $\pi : \{1, 2, \ldots, k\} \to \{1, 2, \ldots, k\}$
generated by all permutations which satisfy one of the
following conditions:

1) $\pi$ is a cyclic permutation of $\{1, 2, \ldots, k\}$;

2) $\pi$ is that permutation of $\{1, 2, \ldots, k\}$ which for
some i < k, interchanges i and i + 1 provided that
in $\mathbb{N}_{\mathcal{E}}$, either

a) $e_i$ and $e_{i+1}$ are not incident on the same
vertex or

b) $e_i$ and $e_{i+1}$ are incident on the same vertex but
neither edge is contained in any cycle of $\mathbb{N}_{\mathcal{E}}$.

For each $\pi \in \mathcal{G}(\mathcal{E})$, let $G_\pi$ denote the gossip matrix induced
by the edge sequence $e_{\pi(1)}, e_{\pi(2)}, \ldots, e_{\pi(k)}$. Then, the
characteristic polynomial of $G_\pi$ is the same for all $\pi \in \mathcal{G}(\mathcal{E})$.

A  proof  of  this  theorem  can  be  found  in  [25].  See  also 
[21]  for  an  alternative  proof. 

Note that if $\mathbb{N}_{\mathcal{E}}$ is a tree, then condition 2 in Theorem 3
implies that for any two successive integers i and i + 1 in
$\{1, 2, \ldots, k\}$, there is a permutation in $\mathcal{G}(\mathcal{E})$ which
interchanges i and i + 1. A simple induction thus proves that if
$\mathbb{N}_{\mathcal{E}}$ is a tree, $\mathcal{G}(\mathcal{E})$ is the set of all permutations on
$\{1, 2, \ldots, k\}$. Therefore, in this case, the characteristic
polynomial of $G_\pi$ is the same for all permutations $\pi$ of
$\{1, 2, \ldots, k\}$.

Suppose that $\mathbb{N}$ is a tree. Then, because of completeness,
$\mathbb{N}_{\mathcal{E}} = \mathbb{N}$, and therefore, $\mathcal{G}(\mathcal{E})$ is the set of all
permutations on $\{1, 2, \ldots, k\}$. Thus, if $\mathbb{N}$ is a tree, Theorem 3
implies that $\rho(N)$ is the same for all minimally complete
gossip matrices N determined by $\mathbb{N}$. This conclusion is not
implied by Theorem 3 if $\mathbb{N}$ is not a tree.
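The tree case of Theorem 3 can be checked on a small example with exact arithmetic. The sketch below is an illustration only: the helper names are invented here, the tree is a made-up path 1-2-3-4, and the characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is a technique chosen for this sketch and not one the source prescribes. It forms the gossip matrix for every ordering of the three edges and collects the distinct characteristic polynomials.

```python
from fractions import Fraction
from itertools import permutations


def single_gossip_matrix(n, i, j):
    """P_ij for agents labeled 1..n, with exact rational entries."""
    P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    a, b = i - 1, j - 1
    P[a][a] = P[a][b] = P[b][a] = P[b][b] = Fraction(1, 2)
    return P


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def gossip_matrix(n, sequence):
    """Product P_{e_k} ... P_{e_1} for the gossip sequence e_1, ..., e_k."""
    G = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for i, j in sequence:
        G = matmul(single_gossip_matrix(n, i, j), G)
    return G


def char_poly(A):
    """Coefficients of det(lambda*I - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[r][r] for r in range(n)) / k
        coeffs.append(c)
        M = [[AM[r][s] + (c if r == s else 0) for s in range(n)]
             for r in range(n)]
    return tuple(coeffs)


tree_edges = [(1, 2), (2, 3), (3, 4)]   # hypothetical tree: the path 1-2-3-4
polys = {char_poly(gossip_matrix(4, order)) for order in permutations(tree_edges)}
print(len(polys))  # 1: all six orderings share one characteristic polynomial
```

Equal characteristic polynomials mean equal eigenvalues, so the second largest eigenvalue magnitude is the same for every ordering of the gossips on this tree.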

As  stated,  the  theorem  is  only  for  sequences  of  single 
gossips.  However,  the  same  theorem  also  applies,  with 
virtually  the  same  proof,  to  sequences  of  multigossips.  This 
is  because  for  purposes  of  analysis,  any  multigossip  can  be 
viewed  as  a  sequence  of  noninteracting  single  gossips 
arranged  in  any  order. 


B.  Multigossiping 

It is clear from the preceding that the rate at which the
gossip variables of a periodically complete gossiping
sequence converge depends not only on p(N) but also on T.
For example, suppose that
$\gamma_1, \gamma_2, \gamma_3, \gamma_4, \ldots, \gamma_T, \gamma_1, \gamma_2, \ldots$
is an infinite periodically complete gossip sequence with
period T. Suppose in addition that $\gamma_1, \gamma_2, \gamma_3$
are noninteracting gossips. Then, these three gossips might be
executed simultaneously as a multigossip
$\{\gamma_1, \gamma_2, \gamma_3\}$, rather than sequentially, at
the beginning of each period without in any way affecting the
complete gossip matrix N corresponding to the original
subsequence $\gamma_1, \gamma_2, \gamma_3, \gamma_4, \ldots, \gamma_T$.
In other words, rather than executing the T-periodic sequence
$\gamma_1, \gamma_2, \gamma_3, \gamma_4, \ldots, \gamma_T, \gamma_1, \gamma_2, \ldots$
the group could execute the periodic sequence
$\{\gamma_1, \gamma_2, \gamma_3\}, \gamma_4, \ldots, \gamma_T, \{\gamma_1, \gamma_2, \gamma_3\}, \gamma_4, \ldots$
without changing the value of p(N). The key point here is that
this sequence has period $T - 2$ rather than T. Thus, by using
multigossiping, the worst case convergence rate for this
gossiping process would be reduced from $p^{1/T}(N)$ to
$p^{1/(T-2)}(N)$. It is obvious that, in general, to get faster
convergence, one would want to implement multigossiping
sequences using the smallest number of distinct multigossips
possible. For the case when N is a tree and the original
subsequence $\gamma_1, \gamma_2, \gamma_3, \gamma_4, \ldots, \gamma_T$
is minimally complete, we know that the order of the gossips in
the sequence can be changed without changing p(N). In this
case, the minimal number of multigossips needed to implement
the original sequence would be the same as the minimal number
of colors needed to color the edges of N subject to the
constraint that no two edges incident on any vertex have the
same color, for edges of the same color would then correspond
to those gossips which could be implemented together as a
single multigossip. Edge coloring is a basic problem in graph
theory [28]. The minimal number of colors required to color a
graph subject to this constraint is called the chromatic index.
Vizing’s theorem states that the chromatic index of a neighbor
graph N is either d or d + 1 where d is the maximum vertex
degree of N [29]. Moreover, if N is a tree, the chromatic index
is d because of König’s theorem [28]. In other words, if N is a
tree with maximum vertex degree d, it is possible to construct
a periodic sequence of multigossips with period T = d which
converges as fast as the sequence
$p^{1/d}(N), p^{2/d}(N), p^{3/d}(N), \ldots$ converges to zero
where N is any minimally complete gossip matrix for the graph.
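The grouping of gossips into multigossips described above can be sketched in code. The following is only an illustrative sketch, not a procedure taken from the paper: it assumes the neighbor graph is a tree given as a list of edges, and the function names `edge_color_tree` and `multigossip_schedule` are hypothetical. It colors the edges by a breadth-first traversal so that no two edges at a vertex share a color, which for a tree needs only d colors, and then treats each color class as one multigossip.

```python
from collections import deque


def edge_color_tree(edges):
    """Properly color the edges of a tree; returns {frozenset(edge): color}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    color = {}
    root = edges[0][0]
    # Each queue entry: (vertex, its parent, color of the edge to the parent).
    queue = deque([(root, None, None)])
    while queue:
        node, parent, parent_color = queue.popleft()
        next_color = 0
        for child in adj[node]:
            if child == parent:
                continue
            if next_color == parent_color:
                next_color += 1  # skip the color already used at this vertex
            color[frozenset((node, child))] = next_color
            queue.append((child, node, next_color))
            next_color += 1
    return color


def multigossip_schedule(edges):
    """Group the gossips (edges) into multigossips, one per color."""
    color = edge_color_tree(edges)
    schedule = [[] for _ in range(max(color.values()) + 1)]
    for u, v in edges:
        schedule[color[frozenset((u, v))]].append((u, v))
    return schedule


# Example tree with maximum vertex degree d = 3 (at vertex 2).
tree = [(1, 2), (2, 3), (2, 4), (4, 5)]
schedule = multigossip_schedule(tree)
```

For this example tree the schedule has three multigossips, matching the maximum vertex degree, and the gossips inside each multigossip share no agent.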

IV.  REQUEST-BASED  GOSSIPING 

Request-based  gossiping  is  a  gossiping  process  in  which  a 
gossip  occurs  between  two  agents  whenever  one  of  the  two 
accepts  a  request  to  gossip  placed  by  the  other.  The  aim  of 
this  section  is  to  discuss  this  process. 

In a request-based gossiping process, a given agent i
may gossip with one of its neighbors at time t only if t is
either an “event time” of agent i or an “event time” of its
neighbor which has made a request to gossip with agent i.
By an event time of agent i is meant a time at which agent i
may place a request to gossip with one of its neighbors. By
an event time interval of agent i is meant the interval of time
between two successive event times of agent i. For obvious
reasons, we assume that the lengths of agent i’s event time
intervals are all bounded above by a finite positive number
$T_i$. We write $\mathcal{T}_i$ for the set of event times of
agent i and $\mathcal{T}$ for the union of the event time
sequences of all n agents.

Conflicts  leading  to  deadlocks  can  arise  if  an  agent  who 
has  placed  a  request  to  gossip,  at  the  same  time  receives  a 
request  to  gossip  from  another  agent.  It  is  challenging  to 
devise  rules  which  resolve  such  conflicts  while  at  the  same 
time  ensuring  exponential  convergence  of  the  gossiping 
process.  One  way  to  avoid  such  conflicts  is  to  assign  event 
times  offline  so  that  no  agent  can  receive  a  request  to 
gossip  at  any  of  its  own  event  times.  There  are  several  ways 
to  do  this  which  will  be  discussed  below. 

From time to time, agent i may have more than one
neighbor to which it might be able to make a request to
gossip. Also from time to time, agent i may receive
more than one request to gossip. While in such situations
decisions about with whom to place requests or whose
request to accept can be randomized, in this paper, we will
examine only completely deterministic strategies. To do
this we will assume that each agent has ordered its
neighbors in $\mathcal{N}_i$ according to some priorities so
that when a choice occurs between neighbors, agent i will
always choose the one with highest priority.

Consider  first  the  situation  when  the  event  times  of 
each  agent  and  each  agent’s  neighbor  priorities  are  chosen 
offline  and  are  fixed  throughout  the  gossiping  process. 
Assume  that  the  event  times  are  chosen  so  that  no  agent 
can  receive  a  request  to  gossip  at  any  of  its  own  event 
times.  Our  aim  is  to  show  that  this  arrangement  can  be 
problematic.  The  following  protocol  illustrates  this. 

Protocol I: At each event time $t \in \mathcal{T}$ the following
rules apply for each $i \in \{1, 2, \ldots, n\}$.

1) If $t \in \mathcal{T}_i$, agent i places a request to gossip
with that neighbor whose priority is the highest.

2) If $t \notin \mathcal{T}_i$, agent i does not place a request
to gossip.

3) Each agent i receiving one or more requests to
gossip must gossip with that requesting neighbor
whose priority is the highest.

4) If $t \notin \mathcal{T}_i$ and agent i does not receive a
request to gossip, it does not gossip.

The following example shows that this simple strategy
will not necessarily lead to a consensus. Suppose that N is
a path graph with edges (a, b), (b, c), (c, d). Assume that
agents a and b have distinct event times and that agents a
and c have the same event times as do agents b and d; note
that this guarantees that no agent can receive a request to
gossip at any of its own event times. To avoid ambiguities
in decision making, suppose that agent b assigns a higher
priority to a than to c and agent c assigns a higher priority
to d than to b. Let t be an event time of agents a and c.
Then, at this time a places a request to gossip with b and c
places a request to gossip with d. Since b and d receive no
other requests, gossips take place between a and b and
between c and d. Alternatively, if t is an event time of
agents b and d, then at this time, b places a request to
gossip with a and d places a request to gossip with c. Since
a and c receive no other requests, gossips again take place
between a and b and between c and d. Thus, under no
conditions is there ever a gossip between b and c, so the
gossiping process will never reach a consensus. The reader
may wish to verify that simply changing the priorities will
not rectify this situation: For any choice of priorities, there
will always be at least one gossip needed to reach a
consensus, which will not take place.
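The failure of Protocol I on the path graph can be checked by direct simulation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that event times simply alternate between the pair {a, c} and the pair {b, d}, which is one way to realize the event time assignment of the example. It records which neighbor pairs ever gossip, first for the priorities used in the text and then for every other choice of priorities for agents b and c.

```python
from itertools import product


def protocol_one_step(requesters, priority):
    """One event time of Protocol I; returns the set of gossiping pairs."""
    requests = {}  # receiving agent -> list of agents requesting it
    for i in requesters:
        target = priority[i][0]  # rule 1: ask the highest-priority neighbor
        requests.setdefault(target, []).append(i)
    gossips = set()
    for receiver, senders in requests.items():
        # rule 3: accept the requester ranked highest by the receiver
        chosen = min(senders, key=priority[receiver].index)
        gossips.add(frozenset((receiver, chosen)))
    return gossips


def pairs_seen(priority, steps=20):
    """Pairs that gossip at least once when event times alternate (assumed)."""
    seen = set()
    for t in range(steps):
        requesters = ("a", "c") if t % 2 == 0 else ("b", "d")
        seen |= protocol_one_step(requesters, priority)
    return seen


PATH_EDGES = {frozenset("ab"), frozenset("bc"), frozenset("cd")}


def missing_edges(priority):
    """Edges of the path graph on which no gossip ever takes place."""
    return PATH_EDGES - pairs_seen(priority)


# Priorities from the example: b prefers a over c, c prefers d over b.
text_priority = {"a": ["b"], "b": ["a", "c"], "c": ["d", "b"], "d": ["c"]}
print(missing_edges(text_priority))

# Try all four priority choices for the two interior agents b and c.
for pb, pc in product((["a", "c"], ["c", "a"]), (["d", "b"], ["b", "d"])):
    trial = {"a": ["b"], "b": pb, "c": pc, "d": ["c"]}
    print(pb, pc, "never gossip:", missing_edges(trial))
```

Under these assumptions every choice of priorities leaves at least one edge of the path without a gossip, in agreement with the claim in the example.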

The preceding example illustrates that fixed priorities
can present problems. In what follows we take an
alternative approach.

In the light of Theorem 1 it is of interest to consider
gossiping protocols which generate repetitively complete
gossip sequences. Towards this end, let us agree to say that
an agent i has completed a round of gossiping after it has
gossiped with each neighbor in $\mathcal{N}_i$ at least once.
Thus, a finite gossiping sequence for the entire group which has
occurred over an interval of length T will be complete if
over the same period each agent in the group completes a
round. In fact in the case when N is a tree, the only way
such a sequence could be complete is if over the same
period each agent in the group completes a round.

For the protocols which follow it will be necessary for
each agent i to keep track of where it is in a particular
round. To do this, agent i makes use of a recursively
updated neighbor queue $q_i(t)$ where $q_i(\cdot)$ is a function
from $\mathcal{T}$ to the set of all possible lists of the $m_i$
labels in $\mathcal{N}_i$, the neighbor set of agent i. Roughly
speaking, $q_i(t)$ is a list of the labels of the neighbors of
agent i at time t which defines the queue of neighbors at time t
which are in line to gossip with agent i. The updating of
$q_i(t)$ is straightforward: If neighbor j gossips with agent i
at time t, the updated queue $q_i(t + 1)$ is obtained by moving
agent j’s label from its current position in $q_i(t)$ to the
end of the queue. If on the other hand, agent i does not gossip
at time t, $q_i(t + 1) = q_i(t)$.

A.  Protocols 

As noted earlier, it is helpful to have event time
assignments which guarantee that no agent can receive a
request to gossip at any of its own event times. One easy
way to accomplish this is to use event time assignments
which satisfy the following assumption.

Distinct Event Times Assumption: For each distinct pair of
integers i and j in $\{1, 2, \ldots, n\}$, $\mathcal{T}_i$ and
$\mathcal{T}_j$ are disjoint sets.

Note that the assumption implies that at any fixed
event time in $\mathcal{T}$, only one agent can receive a
request to gossip, and moreover, that this agent will receive
exactly one such request. A simple protocol which ensures
exponential convergence under this assumption is as follows.

Protocol II: Suppose that the distinct event times
assumption holds. At each event time $t \in \mathcal{T}$, the
following rules apply for each $i \in \{1, 2, \ldots, n\}$.

1) If $t \in \mathcal{T}_i$, agent i places a request to gossip
with that neighbor whose label is in the front of the
queue $q_i(t)$.

2) If $t \notin \mathcal{T}_i$, agent i does not place a request
to gossip.

3) Each agent i receiving a request to gossip must
gossip with the neighbor placing the request.

4) If $t \notin \mathcal{T}_i$ and agent i does not receive a
request to gossip, it does not gossip.

The behavior of Protocol II can be easily explained as
follows. For $i \in \{1, 2, \ldots, n\}$, let $d_i$ be the number
of neighbors of agent i, or equivalently, the degree of vertex i
in N. It is clear that with the distinct event times protocol,
each agent i will complete a round in a time interval containing
no more than $d_i$ event time intervals of agent i. The length
of the time interval large enough to contain $d_i$ successive
event time intervals of agent i for all
$i \in \{1, 2, \ldots, n\}$ is the maximum of the times
$T_i d_i$, $i \in \{1, 2, \ldots, n\}$. Thus, the
sequence of gossips which occur on an interval of this
length must necessarily be complete. We have proved the
following.

Proposition 1: Suppose that the distinct event times
assumption holds and that all agents in the group adhere to
Protocol II. Then, the infinite sequence of gossips
generated will be repetitively complete with period

\[ T = \max_i (d_i T_i). \]
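Protocol II and the period bound of Proposition 1 can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the authors' implementation: agents are labeled 0 to n - 1, agent i's event times are assumed to be the integers t with t mod n = i (so the event time sets are disjoint and every event time interval has length $T_i = n$), and the function names are hypothetical. Both parties to a gossip move the other's label to the end of their neighbor queue, as in the queue update rule described earlier.

```python
def protocol_two(neighbors, steps):
    """Simulate Protocol II with round-robin event times (an assumption).

    neighbors: dict agent -> list of neighbor labels, used as initial queues.
    Returns the list of gossiping pairs, one per time step.
    """
    n = len(neighbors)
    queue = {i: list(nbrs) for i, nbrs in neighbors.items()}
    history = []
    for t in range(steps):
        i = t % n        # the only agent whose event time is t
        j = queue[i][0]  # rule 1: request the front-of-queue neighbor
        # rule 3: j must accept; each moves the other's label to the end
        queue[i].remove(j)
        queue[i].append(j)
        queue[j].remove(i)
        queue[j].append(i)
        history.append(frozenset((i, j)))
    return history


def period_bound(neighbors, T):
    """Period of Proposition 1: the maximum over i of d_i * T_i."""
    return max(len(neighbors[i]) * T[i] for i in neighbors)


# Path graph 0 - 1 - 2 - 3 with T_i = n = 4 for every agent.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
all_edges = {frozenset((i, j)) for i in path for j in path[i]}
T = period_bound(path, {i: len(path) for i in path})
history = protocol_two(path, steps=40)
complete = all(
    set(history[s:s + T]) == all_edges for s in range(len(history) - T + 1)
)
```

In this example the bound evaluates to T = 8, and every window of eight consecutive gossips contains a gossip on each edge of the path, which is what repetitive completeness with period T requires.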

One  shortcoming  of  Protocol  II  is  that  it  does  not  allow 
for  multigossiping.  Another  is  that  the  distinct  event  times 
assumption  on  which  the  protocol  depends  is  somewhat 
stringent.  It  is  possible  to  relax  this  assumption  and  still 
devise  a  protocol  which  ensures  exponential  convergence. 
The  relaxed  assumption  is  as  follows. 

Distinct Neighbor Event Times Assumption: For each
$i \in \{1, 2, \ldots, n\}$ and each $j \in \mathcal{N}_i$,
$\mathcal{T}_i$ and $\mathcal{T}_j$ are disjoint sets.

Thus, if this assumption holds, the event times of each
agent are distinct from the event times of all of its
neighbors. The assignment of event times which satisfy this
assumption is mathematically identical to the classic
“vertex coloring problem” from graph theory [28]. Note
that the distinct neighbor event times assumption
stipulates that no two adjacent vertices on the neighbor graph
N can have the same event times. The rule defining vertex
coloring of N stipulates that no two adjacent vertices can
have the same color. The least number of different colors
required to vertex color N is called the chromatic number
of N [28]. Brooks’ theorem states that, for a connected graph,
this number is bounded above by the maximum degree of N, except
for complete graphs and for cycles of odd length, in which
cases the chromatic number is one plus the maximum degree
of N [30]. Thus, in all cases the largest number of distinct
event time sequences which would need to be assigned
to N to satisfy the distinct neighbor event times
assumption is no greater than one plus the maximum vertex
degree of N.
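One simple way to produce an event time assignment satisfying the distinct neighbor event times assumption is greedy vertex coloring, which never uses more than one plus the maximum vertex degree colors. The sketch below is illustrative and not taken from the paper: the function name `assign_event_classes` is hypothetical, and it is assumed that agents in class c are then given the event times t with t mod m = c, where m is the number of classes used.

```python
def assign_event_classes(neighbors):
    """Greedy vertex coloring; returns dict agent -> event time class."""
    cls = {}
    for i in neighbors:
        # classes already taken by neighbors that have been assigned
        used = {cls[j] for j in neighbors[i] if j in cls}
        c = 0
        while c in used:
            c += 1
        cls[i] = c
    return cls


# A cycle of odd length: maximum degree 2, but three classes are needed.
cycle5 = {0: [4, 1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
classes = assign_event_classes(cycle5)
num_classes = max(classes.values()) + 1

# Event times of agent i under the assumed assignment: t % num_classes == classes[i]
event_times = {
    i: [t for t in range(12) if t % num_classes == classes[i]] for i in cycle5
}
```

Greedy coloring does not always attain the chromatic number, but it respects the bound of one plus the maximum degree, which is all that the discussion above requires.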

Under  the  distinct  neighbor  event  times  assumption,  it 
is  possible  to  ensure  exponential  convergence  with  the 
following  protocol  which  is  a  refinement  of  Protocol  II. 

Protocol III: Suppose that the distinct neighbor event
times assumption holds. At each event time $t \in \mathcal{T}$
the following rules apply for each $i \in \{1, 2, \ldots, n\}$.

1) If $t \in \mathcal{T}_i$, agent i places a request to gossip
with that neighbor whose label is at the front of the
queue $q_i(t)$.

2) If $t \notin \mathcal{T}_i$, agent i does not place a request
to gossip.

3) Each agent i receiving one or more requests to
gossip must gossip with that requesting neighbor
whose label is closest to the front of the queue
$q_i(t)$.

4) If $t \notin \mathcal{T}_i$ and agent i does not receive a
request to gossip, it does not gossip.

Just as with the distinct event times protocol, it is
possible to derive a worst case bound on the time it takes
for all n agents to complete a round of gossiping. As a first
step towards this end, fix i and suppose that at some given
event time $t_0 \in \mathcal{T}_i$, j is the leading label in the
queue $q_i(t_0)$. According to the preceding protocol, from this
event time forward agent i must repeatedly place requests
with agent j to gossip at successive event times in
$\mathcal{T}_i$ until gossiping between the two takes place.
Meanwhile, at these same event times, neighbor j will be
receiving requests to gossip from neighbor i and possibly some
other neighbors. In the worst case, when label i is at the end of
$q_j(t_0)$ at time $t_0$, it will take at most $d_j$ event time
intervals of agent i for label i to advance to the front of
agent j’s queue. This means that agents i and j are guaranteed to
gossip at least once within any time interval containing no
more than $d_j$ event time intervals of agent i. If agent i has
only one neighbor, then the round is complete in at most
$T_i d_j$ time units. On the other hand, if agent i has more than
one neighbor, agent i then begins to place requests to
gossip with the agent whose label k was second from the
front in the queue $q_i(t_0)$. But at this time label i might be,
in the worst case, at the end of the queue for agent k. By
the same reasoning as before, it will take at worst an
additional $d_k$ successive event time intervals of agent i for
agents i and k to gossip. In other words, agent i is
guaranteed to have gossiped at least once with both agent j and
agent k within any time interval containing no more than
$d_j + d_k$ event time intervals of agent i. By repeating this
argument for all labels in the queue $q_i(t_0)$, one reaches the
conclusion that agent i is guaranteed to complete a round
of gossiping with all of its neighbors in any time interval
containing $\sum_{j \in \mathcal{N}_i} d_j$ event time intervals
of agent i. An upper bound on the length of any such interval is
thus $T_i \sum_{j \in \mathcal{N}_i} d_j$.

From the preceding it is clear that within any interval
of time containing either $\sum_{j \in \mathcal{N}_i} d_j$ event
time intervals of agent i or $\sum_{j \in \mathcal{N}_k} d_j$
event time intervals of agent k, neighbors i and k will gossip
at least once; thus neighbors i and k will gossip at least once
in any time interval of length
$\min\{T_i \sum_{j \in \mathcal{N}_i} d_j,\ T_k \sum_{j \in \mathcal{N}_k} d_j\}$.
The maximum of this amount of time over all neighbor pairs is
thus an upper bound on the amount of time it takes for all
neighbor pairs to gossip at least once. But completeness of a
gossip sequence is assured if the sequence contains a gossip for
each possible neighbor pair. We have therefore proved the
following proposition.

Proposition 2: Let $\mathcal{E}$ denote the set of all edges
(i, j) in N. Suppose that the distinct neighbor event times
assumption holds and that all agents in the group adhere to
Protocol III. Then, the infinite sequence of gossips
generated will be repetitively complete with period

\[ T = \max_{(i,k) \in \mathcal{E}} \min\Big\{ T_i \sum_{j \in \mathcal{N}_i} d_j,\ T_k \sum_{j \in \mathcal{N}_k} d_j \Big\}. \]
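As a small worked illustration of the period in Proposition 2, the following sketch evaluates the max-min expression for a given neighbor graph and given bounds $T_i$ on the event time intervals. The function name `period_protocol_three` and the example data are hypothetical and chosen only for illustration.

```python
def period_protocol_three(neighbors, T):
    """Period of Proposition 2 for neighbor lists and interval bounds T[i]."""
    degree = {i: len(neighbors[i]) for i in neighbors}

    def round_time(i):
        # T_i times the sum of the degrees of agent i's neighbors
        return T[i] * sum(degree[j] for j in neighbors[i])

    # maximum over edges (i, k) of the smaller of the two round times
    return max(
        min(round_time(i), round_time(k))
        for i in neighbors
        for k in neighbors[i]
    )


# Path graph a - b - c - d with every event time interval bounded by 2.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
bounds = {i: 2 for i in path}
period = period_protocol_three(path, bounds)
```

For this path the two end edges contribute min(4, 6) = 4 and the middle edge contributes min(6, 6) = 6, so the period evaluates to 6.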


A disadvantage of Protocol III is that it requires the
distinct neighbor event times assumption. This assumption
can only be satisfied by offline assignment of event times
for each agent, and in some applications such an offline
assignment may be undesirable. In a recent doctoral thesis
[13], a clever gossiping protocol is proposed which does
not require the distinct neighbor event times assumption.
The protocol avoids deadlocks and achieves consensus
exponentially fast. A disadvantage of this protocol is that it
requires each agent to obtain the values of all of its
neighbors’ gossip variables at each clock time. Thus, if
communication cost is an important issue, this protocol may not
be satisfactory even though only local information is
required. By exploiting one of the key ideas in [13] together
with the notion of an agent’s neighbor queue $q_i(t)$
defined earlier, it is possible to obtain a gossiping protocol
which also avoids deadlocks and achieves consensus
exponentially fast but without requiring each agent to
obtain the value of more than one of its neighbors’ gossip
variables at each clock time. Our aim below is to outline
this protocol.

Protocol IV: In the following, agent i’s preferred neighbor
at time t is that agent whose label $i^*(t)$ is in the front of
the queue $q_i(t)$. Between clock times t and t + 1 each agent i
performs the steps enumerated below in the order
indicated. Although the agents’ actions need not be precisely
synchronized, it is understood that for each
$k \in \{1, 2, 3\}$ all agents complete step k before any embark
on step k + 1.

1) First Transmission: Agent i places a request to
gossip with its preferred neighbor by sending both
its label i and its gossip value $x_i(t)$ to agent $i^*(t)$.
At the same time agent i receives requests to
gossip (i.e., the labels and corresponding gossip
values) from all of those neighbors which have
agent i as their current preferred neighbor. Let
$\mathcal{R}_i(t)$ denote the set of labels of these requesting
neighbors.

2) Second Transmission: Agent i sends its current
gossip value $x_i(t)$ to those neighbors which have
agent i as their current preferred neighbor, namely
the neighbors of agent i with labels in $\mathcal{R}_i(t)$.

3) Acceptances:

a) If agent i has not placed a request to gossip
but has received at least one request to
gossip, then agent i sends an acceptance to that
particular requesting neighbor whose label is
closest to the front of the queue $q_i(t)$.

b) If agent i has either placed a request to gossip
or has not received any requests to gossip,
then agent i does not send out an acceptance.

4) Gossip variable and queue updates:

a) If agent i either sends an acceptance to or
receives an acceptance from neighbor j, then
agent i gossips with neighbor j by setting

\[ x_i(t + 1) = \frac{x_i(t) + x_j(t)}{2}. \]

Agent i updates its queue by moving j, and
any labels $k \in \{i^*(t)\} \cup \mathcal{R}_i(t)$ for which
$x_k(t) = x_i(t)$, from their current positions in
$q_i(t)$ to the end of the queue while
maintaining their relative order.

b) If agent i has not sent out an acceptance or
received one, then agent i does not update
the value of $x_i(t)$. In addition, $q_i(t)$ is updated
by moving any labels $k \in \{i^*(t)\} \cup \mathcal{R}_i(t)$ for
which $x_k(t) = x_i(t)$ from their current
positions in $q_i(t)$ to the end of the queue while
maintaining their relative order.

It is possible to show that every gossip sequence generated
by this protocol is repetitively complete with period no
greater than the number of edges of N [31]. It follows from
Theorem 1 that any sequence of gossip vectors generated
by this protocol is exponentially convergent.

1) Convergence Rate: Recall that a gossiping sequence is
repetitively complete with period T if each successive
subsequence of gossips of length T in the sequence is
complete; and if, in addition, each gossip in the sequence
repeats once every T time units, the sequence is periodic
with period T. As was noted in Section III-A, for a
repetitively complete sequence of gossiping matrices
$M(1), M(2), \ldots$ which is T-periodic, the convergence rate of
the product $M(t)M(t - 1) \cdots M(1)$ as $t \to \infty$ is
determined by T and by the eigenvalue of
$N = M(T)M(T - 1) \cdots M(1)$ which is second largest in
magnitude. For gossiping sequences which are repetitively
complete but not periodic this is no longer true. Such sequences
are closely related to what are called “nonhomogeneous Markov
chains” for which there is a substantial literature [26].
Notwithstanding this, the following question remains. What
determines the convergence rate of a repetitively complete
gossip sequence which is not necessarily periodic? This is the
question which will be considered next. We will tackle the
question in three steps. First, in Section IV-B, we will
discuss certain relevant basic properties of stochastic
matrices. Then, in Section IV-C, we will study several types of
“seminorms” appropriate to the analysis of
nonhomogeneous Markov chains. Finally, in Section IV-D, we will
show that a certain seminorm provides exactly what is
needed to characterize the convergence rate of a
repetitively complete gossip sequence.

B.  Stochastic  Matrices 

Since gossip matrices are stochastic matrices, a natural
starting point for the study of convergence rates of
gossiping sequences is a review of some of the basic concepts
associated with stochastic matrices. We begin with the idea
of a graph of a stochastic matrix.

1) Graph of a Stochastic Matrix: Many properties of a
stochastic matrix can be usefully described in terms of an
associated directed graph determined by the matrix. The
graph of a nonnegative matrix $M \in R^{n \times n}$, written
$\gamma(M)$, is a directed graph on n vertices with an arc from
vertex i to vertex j just in case $m_{ji} \neq 0$; if (i, j) is
such an arc, we say that i is a neighbor of j and that j is an
observer of i. Thus, $\gamma(M)$ is that directed graph whose
adjacency matrix is the transpose of the matrix obtained by
replacing all nonzero entries in M with ones.

2) Connectivity: There are various notions of
connectivity which are useful in the study of the convergence of
products of stochastic matrices. Perhaps the most familiar
of these is the idea of “strong connectivity.” A directed
graph is strongly connected if there is a directed path
between each pair of distinct vertices. A directed graph is
weakly connected if there is an undirected path between
each pair of distinct vertices. There are other notions of
connectivity which are also useful in this context. To
define several of them, let us agree to call a vertex i of a
directed graph G a root of G if for each other vertex j of G,
there is a directed path from i to j. Thus, i is a root of G if
it is the root of a directed spanning tree of G. We will say
that G is rooted at i if i is in fact a root. Thus, G is rooted at
i just in case each other vertex of G is reachable from
vertex i along a directed path within the graph. G is
strongly rooted at i if each other vertex of G is reachable
from vertex i along a directed path of length 1. Thus, G is
strongly rooted at i if i is a neighbor of every other vertex in
the graph. By a rooted graph G is meant a directed graph
which possesses at least one root. A strongly rooted graph is
a graph which has at least one vertex at which it is strongly
rooted. Note that a nonnegative matrix $M \in R^{n \times n}$
has a strongly rooted graph if and only if it has a positive
column. Note that every strongly connected graph is rooted
and every rooted graph is weakly connected. The converse
statements are false. In particular there are weakly
connected graphs which are not rooted and rooted graphs
which are not strongly connected.
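The connectivity notions just defined are easy to check on small examples. The sketch below is illustrative only, with hypothetical function names; it represents a directed graph on vertices 0 to n - 1 by its set of arcs (i, j) and exhibits a rooted graph that is not strongly connected as well as a weakly connected graph that is not rooted.

```python
def reachable(arcs, start):
    """Vertices reachable from start along directed paths (start included)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for a, b in arcs:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def roots(arcs, n):
    """Vertices from which every other vertex can be reached."""
    return [i for i in range(n) if len(reachable(arcs, i)) == n]


def strong_roots(arcs, n):
    """Vertices with an arc to every other vertex (paths of length 1)."""
    return [
        i for i in range(n)
        if all((i, j) in arcs for j in range(n) if j != i)
    ]


def is_strongly_connected(arcs, n):
    return len(roots(arcs, n)) == n  # every vertex is a root


def is_weakly_connected(arcs, n):
    undirected = arcs | {(b, a) for a, b in arcs}
    return len(reachable(undirected, 0)) == n


chain = {(0, 1), (1, 2)}    # rooted at 0, not strongly connected
fork_in = {(0, 2), (1, 2)}  # weakly connected, but has no root
star = {(0, 1), (0, 2)}     # strongly rooted at 0
```

The chain has vertex 0 as its only root, the graph `fork_in` has no root because neither vertex 0 nor vertex 1 can reach the other, and the star is strongly rooted at vertex 0.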

3) Composition: Since we will be interested in products
of stochastic matrices, we will be interested in graphs of
such products and how they are related to the graphs of the
matrices comprising the products. For this we need the
idea of “composition” of graphs. Let $G_p$ and $G_q$ be two
directed graphs with vertex set V. By the composition of $G_p$
with $G_q$, written $G_q \circ G_p$, is meant the directed graph
with vertex set V and arc set defined in such a way so that
(i, j) is an arc of the composition just in case there is a
vertex k such that (i, k) is an arc of $G_p$ and (k, j) is an arc
of $G_q$. Thus, (i, j) is an arc in $G_q \circ G_p$ if and only if
i has an observer in $G_p$ which is also a neighbor of j in
$G_q$. Note that composition is an associative binary operation;
because of this, the definition extends unambiguously to any
finite sequence of directed graphs $G_1, G_2, \ldots, G_k$ with
the same vertex set.

Composition and matrix multiplication are closely
related. In particular, the graph of the product of two
nonnegative matrices $M_1, M_2 \in R^{n \times n}$ is equal to
the composition of the graphs of the two matrices comprising the
product. In other words,
$\gamma(M_2 M_1) = \gamma(M_2) \circ \gamma(M_1)$.
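The relation between composition and matrix multiplication can be verified numerically on a small example. The following sketch is illustrative; the function names and the two example matrices are hypothetical. It uses the convention stated earlier that the graph of a nonnegative matrix M has an arc from i to j exactly when the entry of M in row j and column i is nonzero.

```python
def graph_of(M):
    """Arc set of the graph of M: arc (i, j) whenever M[j][i] != 0."""
    n = len(M)
    return {(i, j) for i in range(n) for j in range(n) if M[j][i] != 0}


def compose(Gq, Gp):
    """Arc set of Gq o Gp: (i, j) if (i, k) is in Gp and (k, j) is in Gq."""
    return {(i, j) for (i, k) in Gp for (k2, j) in Gq if k == k2}


def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [
        [sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
        for r in range(n)
    ]


# Two stochastic matrices with positive diagonal entries.
M1 = [[0.5, 0.5, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.5, 0.5]]
M2 = [[1.0, 0.0, 0.0],
      [0.5, 0.5, 0.0],
      [0.0, 0.0, 1.0]]

product_graph = graph_of(matmul(M2, M1))
composed_graph = compose(graph_of(M2), graph_of(M1))
```

For these matrices the two arc sets coincide. Because both matrices have positive diagonals, their graphs have self-arcs at every vertex, and the arcs of each individual graph also appear in the composition.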

If we focus exclusively on graphs with self-arcs at all
vertices, more can be said. In this case, the definition of
composition implies that the arcs of both $G_p$ and $G_q$ are
arcs of $G_q \circ G_p$; the converse is false. The definition of
composition also implies that if $G_p$ has a directed path
from i to k and $G_q$ has a directed path from k to j, then
$G_q \circ G_p$ has a directed path from i to j. These
implications are consequences of the requirement that the
vertices of the graphs in question all have self-arcs. It is
worth emphasizing that the union of the arc sets of a sequence of
graphs $G_1, G_2, \ldots, G_k$ with self-arcs must be contained in
the arc set of their composition. However, the converse is
not true in general and it is for this reason that
composition rather than union proves to be the more useful
concept for our purposes.

4) Convergability: It is of obvious interest to have a
clear understanding of what kinds of stochastic matrices
within an infinite product guarantee that the infinite
product converges. There are many ways to address this
issue and many existing results. Here we focus on just
one issue.

Let $\mathcal{S}$ denote the set of all stochastic matrices in
$R^{n \times n}$ with positive diagonal entries. Call a compact
subset $\mathcal{M} \subset \mathcal{S}$ convergable if for each
infinite sequence of matrices $M_1, M_2, M_3, \ldots$ from
$\mathcal{M}$, the sequence of products
$M_1, M_2 M_1, M_3 M_2 M_1, \ldots$ converges exponentially fast
to a matrix of the form $\mathbf{1}c$. Convergability can be
characterized as follows.

Theorem 4: Let $\mathcal{R}$ denote the set of all matrices in
$\mathcal{S}$ with rooted graphs. Then, a compact subset
$\mathcal{M} \subset \mathcal{S}$ is convergable if and only if
$\mathcal{M} \subset \mathcal{R}$.

The theorem implies that $\mathcal{R}$ is the largest subset of
$n \times n$ stochastic matrices with positive diagonal entries
whose compact subsets are all convergable. $\mathcal{R}$ itself
is not convergable because it is not closed and thus not
compact.

Proof of Theorem 4: The fact that any compact subset
of $\mathcal{R}$ is convergable is more or less well known from
the work reported in [32]; the statement also follows from
Proposition 11 of [33]. To prove the converse, suppose that
$\mathcal{M} \subset \mathcal{S}$ is convergable. Then, by
continuity, every sufficiently long product of matrices from
$\mathcal{M}$ must be a matrix with a positive column.
Therefore, the graph of every sufficiently long product of
matrices from $\mathcal{M}$ must be strongly rooted. It follows
from Proposition 5 of [33] that $\mathcal{M}$ must be a subset
of $\mathcal{R}$. ■

Although doubly stochastic matrices are stochastic,
convergability for classes of doubly stochastic matrices has
a different characterization than it does for classes of
stochastic matrices. Let $\mathcal{D}$ denote the set of all
doubly stochastic matrices in $\mathcal{S}$. In the following,
we will prove Theorem 5.

Theorem 5: Let $\mathcal{W}$ denote the set of all matrices in
$\mathcal{D}$ with weakly connected graphs. Then, a compact
subset $\mathcal{M} \subset \mathcal{D}$ is convergable if and
only if $\mathcal{M} \subset \mathcal{W}$.

The theorem implies that $\mathcal{W}$ is the largest subset of
$n \times n$ doubly stochastic matrices with positive diagonal
entries whose compact subsets are all convergable. Like
$\mathcal{R}$, $\mathcal{W}$ is not convergable because it is
not compact.

An interesting set of stochastic matrices in $\mathcal{S}$ whose compact subsets are known to be convergable is the set of all “scrambling matrices.” A matrix $S \in \mathcal{S}$ is scrambling if for each distinct pair of integers $i$ and $j$, there is a column $k$ of $S$ for which $s_{ik}$ and $s_{jk}$ are both nonzero [26]. In graph theoretic terms $S$ is a scrambling matrix just in case its graph is “neighbor shared” where by neighbor shared we mean that each distinct pair of vertices in the graph share a common neighbor [33]. Convergability of compact subsets of scrambling matrices is tied up with the concept of the coefficient of ergodicity [26] which for a given stochastic matrix $S \in \mathcal{S}$ is defined by

$$\tau(S) = \frac{1}{2}\max_{i,j}\sum_{k=1}^{n} |s_{ik} - s_{jk}|. \tag{1}$$

It is known that $0 \le \tau(S) \le 1$ for all $S \in \mathcal{S}$ and that

$$\tau(S) < 1 \tag{2}$$

if and only if $S$ is a scrambling matrix. It is also known that

$$\tau(S_2S_1) \le \tau(S_2)\tau(S_1), \quad S_1, S_2 \in \mathcal{S}. \tag{3}$$

It can be shown that (2) and (3) are sufficient conditions to ensure that any compact subset of scrambling matrices is convergable. But $\tau(\cdot)$ has another role. It provides a worst case convergence rate for any infinite product of scrambling matrices from a given compact set $\mathcal{C} \subset \mathcal{S}$. In particular, it can be easily shown that as $i \to \infty$, any product $S_iS_{i-1}\cdots S_2S_1$ of scrambling matrices $S_i \in \mathcal{C}$ converges to a matrix of the form $\mathbf{1}c$ as fast as $\lambda^i$ converges to zero where

$$\lambda = \max_{S \in \mathcal{C}} \tau(S).$$
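The definitions in (1)-(3) can be checked numerically on small examples. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the function names `ergodicity_coefficient`, `is_scrambling`, and `matmul`, and the two example matrices, are illustrative choices and do not come from the paper.

```python
from itertools import combinations

def ergodicity_coefficient(S):
    """tau(S) = (1/2) * max over row pairs (i, j) of sum_k |s_ik - s_jk|, as in (1)."""
    n = len(S)
    if n < 2:
        return 0.0
    return 0.5 * max(
        sum(abs(a - b) for a, b in zip(S[i], S[j]))
        for i, j in combinations(range(n), 2)
    )

def is_scrambling(S):
    """True if every pair of distinct rows has a column where both entries are nonzero."""
    n = len(S)
    return all(
        any(S[i][k] != 0 and S[j][k] != 0 for k in range(n))
        for i, j in combinations(range(n), 2)
    )

def matmul(A, B):
    """Plain matrix product of two square lists-of-rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical examples: a 3 x 3 circulant averaging matrix and the identity.
S1 = [[0.5, 0.5, 0.0],
      [0.0, 0.5, 0.5],
      [0.5, 0.0, 0.5]]
S2 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]

tau_1 = ergodicity_coefficient(S1)               # scrambling, so tau < 1 by (2)
tau_2 = ergodicity_coefficient(S2)               # not scrambling, so tau = 1
tau_11 = ergodicity_coefficient(matmul(S1, S1))  # bounded by tau_1 ** 2 via (3)
```

For these two matrices the scrambling test and the condition $\tau(S) < 1$ agree, and the product obeys the submultiplicative bound (3).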

The preceding discussion suggests the following question. Can analogs of the coefficient of ergodicity satisfying formulas like (2) and (3) be found for the set of stochastic matrices with rooted graphs or perhaps for the set of doubly stochastic matrices with weakly connected graphs? In the following, we will provide a partial answer to this question for the case of stochastic matrices and a complete answer for the case of doubly stochastic matrices. Our approach will be to appeal to certain types of seminorms of stochastic matrices.

C.  Seminorms 

Let $\|\cdot\|_p$ be the induced $p$-norm on $\mathbb{R}^{m \times n}$. We will be interested in $p = 1, 2, \infty$. Note that for a nonnegative matrix $A$

$$\|A\|_1 = \text{max column sum of } A$$

$$\|A\|_2 = \sqrt{\mu(A'A)}$$

$$\|A\|_\infty = \text{max row sum of } A$$

where $\mu(A'A)$ is the largest eigenvalue of $A'A$; that is, the square of the largest singular value of $A$. For $M \in \mathbb{R}^{m \times n}$ define

$$|M|_p = \min_{c \in \mathbb{R}^{1 \times n}} \|M - \mathbf{1}c\|_p.$$

As defined, $|\cdot|_p$ is nonnegative and $|M|_p \le \|M\|_p$; clearly $|\mu M|_p = |\mu||M|_p$ for all real numbers $\mu$ so $|\cdot|_p$ is “positively homogeneous” [27]. Let $M_1$ and $M_2$ be matrices in $\mathbb{R}^{m \times n}$ and let $c_0$, $c_1$, and $c_2$ denote values of $c$ which minimize $\|M_1 + M_2 - \mathbf{1}c\|_p$, $\|M_1 - \mathbf{1}c\|_p$, and $\|M_2 - \mathbf{1}c\|_p$, respectively. Note that

$$\begin{aligned}
|M_1 + M_2|_p &= \|M_1 + M_2 - \mathbf{1}c_0\|_p \\
&\le \|M_1 + M_2 - \mathbf{1}(c_1 + c_2)\|_p \\
&\le \|M_1 - \mathbf{1}c_1\|_p + \|M_2 - \mathbf{1}c_2\|_p \\
&= |M_1|_p + |M_2|_p.
\end{aligned}$$

Thus, the triangle inequality holds. These properties mean that $|\cdot|_p$ is a seminorm. $|\cdot|_p$ behaves much like a norm. For example, if $N$ is a submatrix of $M$, then $|N|_p \le |M|_p$. However, $|\cdot|_p$ is not a norm because $|M|_p = 0$ does not imply $M = 0$; rather it implies that $M = \mathbf{1}c$ for some row vector $c$ which minimizes $\|M - \mathbf{1}c\|_p$. For our purposes, $|\cdot|_p$ has a particularly important property.

Lemma 1: Suppose $\mathcal{M}$ is a subset of $\mathbb{R}^{n \times n}$ such that $M\mathbf{1} = \mathbf{1}$ for all $M \in \mathcal{M}$. Then

$$|M_2M_1|_p \le |M_2|_p\,|M_1|_p. \tag{4}$$

We say that $|\cdot|_p$ is submultiplicative on $\mathcal{M}$.

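Proof of Lemma 1: Let $c_0$, $c_1$, and $c_2$ denote values of $c$ which minimize $\|M_2M_1 - \mathbf{1}c\|_p$, $\|M_1 - \mathbf{1}c\|_p$, and $\|M_2 - \mathbf{1}c\|_p$, respectively. Then

$$\begin{aligned}
|M_2M_1|_p &= \|M_2M_1 - \mathbf{1}c_0\|_p \\
&\le \|M_2M_1 - \mathbf{1}(c_2M_1 + c_1 - c_2\mathbf{1}c_1)\|_p \\
&= \|M_2M_1 - \mathbf{1}c_2M_1 - M_2\mathbf{1}c_1 + \mathbf{1}c_2\mathbf{1}c_1\|_p \\
&= \|(M_2 - \mathbf{1}c_2)(M_1 - \mathbf{1}c_1)\|_p \\
&\le \|M_2 - \mathbf{1}c_2\|_p\,\|M_1 - \mathbf{1}c_1\|_p \\
&= |M_2|_p\,|M_1|_p.
\end{aligned}$$

Thus, (4) is true. ■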


We say that $M \in \mathbb{R}^{n \times n}$ is semicontractive in the $p$-norm if $|M|_p < 1$. In view of Lemma 1, the product of semicontractive matrices in $\mathcal{M}$ is thus semicontractive. The importance of these ideas lies in the following fact.

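Proposition 3: Suppose $\mathcal{M}$ is a subset of $\mathbb{R}^{n \times n}$ such that $M\mathbf{1} = \mathbf{1}$ for all $M \in \mathcal{M}$. Let $p$ be fixed and let $\bar{\mathcal{M}}$ be a compact set of semicontractive matrices in $\mathcal{M}$. Let

$$\lambda = \sup_{M \in \bar{\mathcal{M}}} |M|_p.$$

Then, for each infinite sequence of matrices $M_i \in \bar{\mathcal{M}}$, $i \in \{1, 2, \ldots\}$, the matrix product

$$M_iM_{i-1}\cdots M_1$$

converges as $i \to \infty$ as fast as $\lambda^i$ converges to zero, to a rank one matrix of the form $\mathbf{1}c$.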

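Proof of Proposition 3: Clearly $|M|_p \le \lambda$ for every matrix $M$ in the given compact set. Moreover, $\lambda < 1$ because each such $M$ is semicontractive and because the set is compact. Write $M_i = \mathbf{1}c_i + T_i$, $i \ge 1$, where $c_i$ is a value of $c$ which minimizes $\|M_i - \mathbf{1}c\|_p$. For $i \ge 1$ set $X_i = M_iM_{i-1}\cdots M_1$ and $Y_i = T_iT_{i-1}\cdots T_1$. Clearly $|M_i|_p = \|T_i\|_p$, $i \ge 1$, so

$$\|Y_i\|_p \le \lambda^i, \quad i \ge 1. \tag{5}$$

A simple computation yields

$$X_k = Y_k + \sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}, \quad k \ge 1 \tag{6}$$

where $Y_0 = I$. Note also that because of (5), $Y_k$ tends to zero as $k \to \infty$. We claim that the sequence $\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}$, $k \ge 1$, has a limit. To prove that this is so it is enough to show that $\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}$, $k \ge 1$, is a Cauchy sequence. Towards this end observe that

$$\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \sum_{i=1}^{k} \mathbf{1}c_iY_{i-1} = \sum_{i=k+1}^{j+k} \mathbf{1}c_iY_{i-1}, \quad j \ge 1.$$

Moreover

$$\|\mathbf{1}c_iY_{i-1}\|_p \le d\lambda^{i-1}, \quad i \ge 1$$

where $d = \sup_{i \ge 1} \|\mathbf{1}c_i\|_p$. Therefore, for $j \ge 1$

$$\left\|\sum_{i=k+1}^{j+k} \mathbf{1}c_iY_{i-1}\right\|_p \le d\sum_{i=k+1}^{j+k} \lambda^{i-1} = d\lambda^k\sum_{s=1}^{j} \lambda^{s-1} \le d\lambda^k\sum_{s=1}^{\infty} \lambda^{s-1} = d\,\frac{\lambda^k}{1-\lambda}.$$

Therefore

$$\left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}\right\|_p \le d\,\frac{\lambda^k}{1-\lambda}, \quad j, k \ge 1 \tag{7}$$

which shows that $\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}$, $k \ge 1$, is a Cauchy sequence. Therefore, the sequence $\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}$, $k \ge 1$, has a limit which we denote by $\mathbf{1}\bar{c}$.

Note next that for $j, k \ge 1$

$$\begin{aligned}
\left\|\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p
&= \left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c} - \left(\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}\right)\right\|_p \\
&\le \left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p + \left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \sum_{i=1}^{k} \mathbf{1}c_iY_{i-1}\right\|_p.
\end{aligned}$$

In view of (7)

$$\left\|\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p \le \left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p + d\,\frac{\lambda^k}{1-\lambda}.$$

But $\left\|\sum_{i=1}^{j+k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p$ tends to zero as $j \to \infty$ so

$$\left\|\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p \le d\,\frac{\lambda^k}{1-\lambda}, \quad k \ge 1. \tag{8}$$

To proceed observe that (6) implies that

$$\|X_k - \mathbf{1}\bar{c}\|_p \le \|Y_k\|_p + \left\|\sum_{i=1}^{k} \mathbf{1}c_iY_{i-1} - \mathbf{1}\bar{c}\right\|_p.$$

From this, (5), and (8), it follows that

$$\|X_k - \mathbf{1}\bar{c}\|_p \le \lambda^k\left(1 + \frac{d}{1-\lambda}\right), \quad k \ge 1.$$

This completes the proof. ■

1) Case p = 1: We now consider in more detail the case when $p = 1$. For this case, it is possible to derive an explicit formula for the seminorm $|M|_1$ of a nonnegative matrix $M \in \mathbb{R}^{n \times n}$.

Proposition 4: Let $q$ be the unique integer quotient of $n$ divided by 2. Let $M \in \mathbb{R}^{n \times n}$ be a nonnegative matrix. Then

$$|M|_1 = \max_{j \in \{1,2,\ldots,n\}} \left(\sum_{i \in \mathcal{L}_j} m_{ij} - \sum_{i \in \mathcal{S}_j} m_{ij}\right)$$

where $\mathcal{L}_j$ and $\mathcal{S}_j$ are, respectively, the row indices of the $q$ largest and $q$ smallest entries in the $j$th column of $M$.

This result is a direct consequence of the following lemma and the definition of $|\cdot|_1$.

Lemma 2: Let $q$ denote the unique integer quotient of $n$ divided by 2. Let $y$ be a nonnegative $n$-vector and write $\mathcal{L}$ and $\mathcal{S}$ for the row indices of the $q$ largest and $q$ smallest entries in $y$, respectively. Then

$$|y|_1 = \sum_{i \in \mathcal{L}} y_i - \sum_{i \in \mathcal{S}} y_i$$

where $y_i$ is the $i$th entry in $y$.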

Proof of Lemma 2: Let $a$ denote the $n$-vector whose entries $a_1, a_2, \ldots, a_n$ are the entries of $y$ relabeled so that $a_1 \le a_2 \le \cdots \le a_n$. Then

$$\sum_{i \in \mathcal{L}} y_i = \sum_{i > q+r} a_i \quad \text{and} \quad \sum_{i \in \mathcal{S}} y_i = \sum_{i \le q} a_i$$

where $r$ is the unique integer remainder of $n$ divided by 2. Moreover

$$\|y - \mathbf{1}x\|_1 = \|a - \mathbf{1}x\|_1, \quad x \in \mathbb{R}.$$

Therefore, to prove the lemma, it is enough to show that

$$|a|_1 = \min_{x \in \mathbb{R}} \|a - \mathbf{1}x\|_1 = \sum_{i > q+r} a_i - \sum_{i \le q} a_i. \tag{9}$$

Suppose $n$ is even in which case $n = 2q$ and $r = 0$. If $q = 1$, then $\min_{x \in \mathbb{R}}(|a_1 - x| + |a_2 - x|) = a_2 - a_1$ in which case (9) holds. Suppose $q > 1$. For fixed $k \in \{1, 2, \ldots, q-1\}$ and any value of $x$ located in the interval $[a_k, a_{k+1}]$, it must be true that $|a_i - x| = x - a_i$ for $i \le k$ and $|a_i - x| = a_i - x$ for $i \ge k+1$. Since $k < q$, the number of values of $i$ such that $i \ge k+1$ is greater than the number of values of $i$ such that $i \le k$. Thus, for $x \in [a_k, a_{k+1}]$, the sum $\sum_{i=1}^{n} |a_i - x|$ is a linear polynomial in $x$ and the coefficient of $x$ is negative. Since this is true for all $k \in \{1, 2, \ldots, q-1\}$, the sum $\sum_{i=1}^{n} |a_i - x|$ is a decreasing function of $x$ on the union of the intervals $[a_k, a_{k+1}]$, $k = 1, 2, \ldots, q-1$; i.e., on $[a_1, a_q]$. By similar reasoning $\sum_{i=1}^{n} |a_i - x|$ is an increasing function of $x$ on the interval $[a_{q+1}, a_n]$. Meanwhile, for values of $x \in [a_q, a_{q+1}]$, clearly $|a_q - x| = x - a_q$ and $|a_{q+1} - x| = a_{q+1} - x$, so the sum $\sum_{i=1}^{n} |a_i - x|$ is a constant. But $\sum_{i=1}^{n} |a_i - x|$ is a continuous function of $x$. Therefore, $\sum_{i=1}^{n} |a_i - x|$ is nonincreasing for $x \le a_q$ and nondecreasing for $x \ge a_q$. Therefore, a value of $x$ which minimizes $\sum_{i=1}^{n} |a_i - x|$ is $x = a_q$. Equation (9) follows at once.

Now suppose $n$ is odd in which case $r = 1$. For fixed $k \in \{1, 2, \ldots, q\}$ and any value of $x$ located in the interval $[a_k, a_{k+1}]$, it must be true that $|a_i - x| = x - a_i$ for $i \le k$ and $|a_i - x| = a_i - x$ for $i \ge k+1$. Since $k \le q$, the number of values of $i$ such that $i \ge k+1$ is greater than the number of values of $i$ such that $i \le k$. Thus, for $x \in [a_k, a_{k+1}]$, the sum $\sum_{i=1}^{n} |a_i - x|$ is a linear polynomial in $x$ and the coefficient of $x$ is negative. Since this is true for all $k \in \{1, 2, \ldots, q\}$, the sum $\sum_{i=1}^{n} |a_i - x|$ is a decreasing function of $x$ on the union of the intervals $[a_k, a_{k+1}]$, $k = 1, 2, \ldots, q$; i.e., on $[a_1, a_{q+1}]$. By similar reasoning $\sum_{i=1}^{n} |a_i - x|$ is an increasing function of $x$ on the interval $[a_{q+1}, a_n]$. Therefore, the unique value of $x$ which minimizes $\sum_{i=1}^{n} |a_i - x|$ is $x = a_{q+1} = a_{q+r}$. Equation (9) follows at once. ■

Consider now the case when $M$ is a doubly stochastic matrix $S$. Then, the column sums of $S$ are all equal to 1. This implies that $|S|_1 \le 1$ because $|S|_1 \le \|S\|_1 = 1$. The column sums all equaling one also imply that

$$\sum_{i \in \mathcal{L}_j} s_{ij} + r m_j + \sum_{i \in \mathcal{S}_j} s_{ij} = 1, \quad j \in \{1, 2, \ldots, n\}$$

where $m_j$ is the median$^3$ of the $n$ entries in the $j$th column of $S$. Therefore

$$|S|_1 = \max_{j \in \{1,2,\ldots,n\}} \left\{2\sum_{i \in \mathcal{L}_j} s_{ij} + r m_j - 1\right\}.$$

This means that $S$ is semicontractive in the one-norm just in case

$$\sum_{i \in \mathcal{L}_j} s_{ij} + \frac{r}{2}\,m_j < 1, \quad j \in \{1, 2, \ldots, n\}.$$

We are led to the following result.

Theorem 6: Let $q$ be the unique integer quotient of $n$ divided by 2. Let $S \in \mathbb{R}^{n \times n}$ be a doubly stochastic matrix. Then, $|S|_1 \le 1$. Moreover, $S$ is a semicontraction in the one-norm if and only if the number of nonzero entries in each column of $S$ exceeds $q$.

Note that the doubly stochastic matrix

[matrix not recoverable from the source]


has  a  weakly  connected  graph  but  is  not  a  semicontraction 
for  p  =  1.  Thus,  this  particular  seminorm  is  not  as  useful  as 
we  would  like  for  gossiping  problems. 
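The column formula of Proposition 4 and the column-count test of Theorem 6 are easy to evaluate directly. The sketch below is a minimal plain-Python illustration, assuming matrices are lists of rows; the helper names and the two circulant example matrices are hypothetical and are not the matrices used in the paper.

```python
def seminorm_1(M):
    """|M|_1 via Proposition 4: per column, the sum of the q largest entries
    minus the sum of the q smallest entries, maximized over columns."""
    n = len(M)
    q = n // 2
    best = 0.0
    for j in range(n):
        col = sorted(M[i][j] for i in range(n))
        best = max(best, sum(col[n - q:]) - sum(col[:q]))
    return best

def seminorm_1_direct(M):
    """|M|_1 from the definition: max over columns of min_x sum_i |m_ij - x|.
    The inner minimum of this piecewise-linear convex function is attained
    at one of the column entries, so only those candidates are tried."""
    n = len(M)
    worst = 0.0
    for j in range(n):
        col = [M[i][j] for i in range(n)]
        worst = max(worst, min(sum(abs(v - x) for v in col) for x in col))
    return worst

def columns_exceed_q_nonzeros(S):
    """Theorem 6 test: every column has more than q = n // 2 nonzero entries."""
    n = len(S)
    q = n // 2
    return all(sum(1 for i in range(n) if S[i][j] != 0) > q for j in range(n))

# Hypothetical doubly stochastic examples (circulant averaging matrices).
S3 = [[0.5, 0.5, 0.0],
      [0.0, 0.5, 0.5],
      [0.5, 0.0, 0.5]]
S4 = [[0.5, 0.5, 0.0, 0.0],
      [0.0, 0.5, 0.5, 0.0],
      [0.0, 0.0, 0.5, 0.5],
      [0.5, 0.0, 0.0, 0.5]]

norm_3 = seminorm_1(S3)   # n = 3, q = 1: two nonzeros per column, semicontraction
norm_4 = seminorm_1(S4)   # n = 4, q = 2: two nonzeros per column, not a semicontraction
```

The 4 x 4 example has a weakly connected graph yet its one-norm seminorm equals 1, which is the kind of behavior that limits the usefulness of this seminorm.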

It is possible to compare this seminorm with the coefficient of ergodicity. Observe that while the preceding matrix is not a semicontraction it is a scrambling matrix. Thus, for this example, $\tau(S) < |S|_1 = 1$. On the other hand, there are also doubly stochastic matrices which are semicontractions but which are not scrambling matrices. An example of this is the matrix

$S =$ [matrix not recoverable from the source]

Thus, for this example, $|S|_1 < \tau(S) = 1$, which means that there are situations when it may be more advantageous to use the seminorm $|\cdot|_1$ to compute convergence rates than to appeal to the coefficient of ergodicity.

$^3$The median of a finite set of real numbers is the “middle value” of the set. More precisely, suppose that $A$ is a set of $n$ real numbers which are ordered so that $a_1 \le a_2 \le \cdots \le a_n$ and let $q$ and $r$ be, respectively, the unique integer quotient and remainder of $n$ divided by 2. If $n$ is odd, the median of $A$ is $a_{q+r}$. If $n$ is even, the median of $A$ is defined to be the average of $a_q$ and $a_{q+1}$.

2) Case p = ∞: Note that in this case $|S|_\infty \le 1$ for any stochastic matrix because $|S|_\infty \le \|S\|_\infty = 1$. Although not at all obvious, it turns out that $|S|_\infty$ equals the well-known coefficient of ergodicity discussed earlier and defined by (1). This is an immediate consequence of Proposition 5 which is stated below. Unfortunately, the last example in the preceding section shows that there are doubly stochastic matrices with weakly connected graphs which are not scrambling matrices. Thus, this particular seminorm is also not useful for our purposes.

Proposition 5: Let $A \in \mathbb{R}^{n \times n}$ be a nonnegative matrix. Then

$$|A|_\infty = \frac{1}{2}\max_{i,j}\sum_{k=1}^{n} |a_{ik} - a_{jk}|.$$

The proof of Proposition 5 depends on the following lemma.

Lemma 3: Suppose $A = \{a_1, a_2, \ldots, a_m\}$ is a set of $m > 1$ row vectors in $\mathbb{R}^{1 \times n}$. Let $d(x, y)$ denote the metric

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|, \quad x, y \in \mathbb{R}^{1 \times n}$$

where $x_i$ and $y_i$ are the $i$th entries of $x$ and $y$, respectively. Then

$$\min_{c \in \mathbb{R}^{1 \times n}} \max_i d(c, a_i) = \frac{1}{2}\max_{j,k} d(a_j, a_k).$$

Proof of Lemma 3: For any $j$ and $k$, and any row vector $c \in \mathbb{R}^{1 \times n}$, $d(a_j, c) + d(c, a_k) \ge d(a_j, a_k)$ because of the triangle inequality. Clearly $\max_i d(c, a_i) \ge (1/2)(d(a_j, c) + d(c, a_k))$, so

$$\max_i d(c, a_i) \ge \frac{1}{2}\,d(a_j, a_k)$$

for all $j, k$. In particular, $\max_i d(c, a_i) \ge (1/2)\max_{j,k} d(a_j, a_k)$. Since this is true for all $c$

$$\min_c \max_i d(c, a_i) \ge \frac{1}{2}\max_{j,k} d(a_j, a_k).$$

To complete the proof, it must be shown that

$$\min_c \max_i d(c, a_i) \le \frac{1}{2}\max_{j,k} d(a_j, a_k).$$

For this it is enough to show that there exists a closed ball $B$ in $\mathbb{R}^{1 \times n}$ with radius $r = (1/2)\max_{j,k} d(a_j, a_k)$ such that $B \supset A$. Towards this end$^4$ let $B$ denote the ball

$$B = \left\{x : x \in \mathbb{R}^{1 \times n},\ \sum_{i=1}^{n} |x_i| \le r\right\}.$$

The boundary of $B$, $\partial B$, consists of $2^{n-1}$ pairs of parallel $(n-1)$-hyperplanes. Consider one of these pairs, $\mathcal{H}_1: x_1 + x_2 + \cdots + x_n = r$ and $\mathcal{H}_2: x_1 + x_2 + \cdots + x_n = -r$. For any $y \in \mathcal{H}_1$ and $z \in \mathcal{H}_2$

$$\sum_{i=1}^{n} |y_i - z_i| \ge \sum_{i=1}^{n} (y_i - z_i) = 2r$$

which shows that the distance of the two hyperplanes equals $2r$. Thus, there must exist a real number $s$ such that every $a_i \in A$ lies between or on the two parallel hyperplanes

$$x_1 + x_2 + \cdots + x_n = s + r$$

$$x_1 + x_2 + \cdots + x_n = s - r.$$

Using the same arguments, the other $2^{n-1} - 1$ pairs of parallel hyperplanes also can be shown to exist. Therefore, $A$ can be contained in a closed set bounded by the $2^{n-1}$ pairs of parallel $(n-1)$-hyperplanes which is a closed ball with radius $(1/2)\max_{j,k} d(a_j, a_k)$ in $\mathbb{R}^{1 \times n}$. ■

$^4$We are indebted to Chun-Yi Sun (Department of Mathematics, Yale University) for pointing this out to us.

3) Case p = 2: For the case when $p = 2$ it is also possible to derive an explicit formula for the seminorm $|M|_2$ of a nonnegative matrix $M \in \mathbb{R}^{n \times n}$. Towards this end note that for any $x \in \mathbb{R}^n$, the function

$$\begin{aligned}
g(x, c) &= x'(M - \mathbf{1}c)'(M - \mathbf{1}c)x \\
&= x'M'Mx - 2x'M'\mathbf{1}cx + n(cx)^2
\end{aligned}$$

attains its minimum with respect to $c$ at

$$c = \frac{1}{n}\mathbf{1}'M.$$

This implies that

$$|M|_2 = \|PM\|_2 = \sqrt{\mu\{M'P'PM\}}$$

where $P = I - (1/n)\mathbf{1}\mathbf{1}'$ and, for any symmetric matrix $T$, $\mu\{T\}$ is the largest eigenvalue of $T$. We are led to the following result.

Proposition 6: Let $M \in \mathbb{R}^{n \times n}$ be a nonnegative matrix. Then, $|M|_2$ is the largest singular value of the matrix $PM$ where $P$ is the orthogonal projection on the orthogonal complement of the span of $\mathbf{1}$.

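Now suppose that $M$ is a doubly stochastic matrix $S$. Then, $S'S$ is also doubly stochastic and $\mathbf{1}'S = \mathbf{1}'$. The latter and Proposition 6 imply that

$$|S|_2 = \sqrt{\mu\left\{S'S - \frac{1}{n}\mathbf{1}\mathbf{1}'\right\}}.$$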

More  can  be  said. 

Lemma 4: If $S$ is doubly stochastic, then $\mu\{S'S - (1/n)\mathbf{1}\mathbf{1}'\}$ is the second largest eigenvalue of $S'S$.

Proof of Lemma 4: Since $S'S$ is symmetric it has orthogonal eigenvectors one of which is $\mathbf{1}$. Let $\mathbf{1}, x_2, \ldots, x_n$ be such a set of eigenvectors with eigenvalues $1, \lambda_2, \ldots, \lambda_n$. Then, $S'S\mathbf{1} = \mathbf{1}$ and $S'Sx_i = \lambda_ix_i$, $i \in \{2, 3, \ldots, n\}$. Clearly $(S'S - (1/n)\mathbf{1}\mathbf{1}')\mathbf{1} = 0$ and $(S'S - (1/n)\mathbf{1}\mathbf{1}')x_i = \lambda_ix_i$, $i \in \{2, 3, \ldots, n\}$. Since 1 is the largest eigenvalue of $S'S$ it must therefore be true that the second largest eigenvalue of $S'S$ is the largest eigenvalue of $S'S - (1/n)\mathbf{1}\mathbf{1}'$. ■

We  summarize  as  follows. 

Theorem  7:  For  p  =  2,  the  seminorm  of  a  doubly 
stochastic  matrix  S  is  the  second  largest  singular  value 
of  S. 
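As a numerical illustration of Theorem 7, the 2-norm seminorm of a doubly stochastic matrix can be approximated as the square root of the largest eigenvalue of $S'S - (1/n)\mathbf{1}\mathbf{1}'$. The sketch below uses plain power iteration, which is a technique chosen here for self-containment rather than anything prescribed by the paper; the function names, the start vector, and the example matrices are assumptions. Power iteration can fail if the start vector happens to be orthogonal to the dominant eigenvector, so this is not a robust general-purpose routine.

```python
import math

def matmul(A, B):
    """Plain matrix product of two square lists-of-rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def seminorm_2(S, iters=500):
    """Approximate |S|_2 for a doubly stochastic S by power iteration on the
    symmetric positive semidefinite matrix A = S'S - (1/n) 1 1'."""
    n = len(S)
    A = [[sum(S[k][i] * S[k][j] for k in range(n)) - 1.0 / n
          for j in range(n)] for i in range(n)]
    v = [float((i + 1) ** 2) for i in range(n)]  # arbitrary non-constant start
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:          # iterate fell into the null space of A
            return math.sqrt(max(lam, 0.0))
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalized iterate
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return math.sqrt(max(lam, 0.0))

# Hypothetical examples for three agents: pairwise-averaging matrices.
P01 = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
P12 = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]
C3 = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]

coeff_single = seminorm_2(P01)               # graph not weakly connected
coeff_product = seminorm_2(matmul(P12, P01)) # graph weakly connected
coeff_circulant = seminorm_2(C3)
```

In this sketch the single averaging step leaves one agent isolated and has seminorm 1, while the product of the two steps touches every agent and has seminorm below 1.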

There is another way to think about what this theorem implies. Prompted by the work in [8] and [23], suppose one wants to measure in the sense of a 2-norm $\|\cdot\|_2$, how much closer an $n$-vector $x$ gets to the average vector $z = (1/n)\mathbf{1}\mathbf{1}'x$ when it is multiplied by a doubly stochastic matrix $S$. In other words how does the norm $\|Sx - z\|_2$ compare with $\|x - z\|_2$? To address this question, note first that $x - z \in \mathcal{O}$ where $\mathcal{O}$ is the orthogonal complement of the span of $\mathbf{1}$. Note next that, because $Sz = z$,

$$\|Sx - z\|_2^2 = \|S(x - z)\|_2^2 \le \left(\sup_{y \in \mathcal{O}} \frac{y'S'Sy}{y'y}\right)\|x - z\|_2^2.$$

But $\sup_{y \in \mathcal{O}} y'S'Sy/y'y$ is the second largest eigenvalue of $S'S$ which in turn is the square of the second largest singular value of $S$. In other words, $\|Sx - z\|_2 \le |S|_2\|x - z\|_2$. Thus, $Sx$ is always at least as close to the average vector $z$ as $x$ is, and is strictly closer when $S$ is a semicontraction in the 2-norm and $x \ne z$.

In the light of Theorem 7, we are now in a position to characterize in graph-theoretic terms those doubly stochastic matrices with positive diagonal entries which are semicontractions for $p = 2$.

Theorem 8: Let $S$ be a doubly stochastic matrix with positive diagonal entries. Then, $|S|_2 \le 1$. Moreover, $S$ is a semicontraction in the 2-norm if and only if the graph of $S$ is weakly connected.

To prove this theorem we need several concepts and results. Let $G$ denote a directed graph and write $G'$ for that graph which results when the arcs in $G$ are reversed; i.e., the dual graph. Call a graph symmetric if it is equal to its dual. Note that in the case of a symmetric graph, the three properties of being rooted, strongly connected, and weakly connected are equivalent. Note also that if $G$ is the graph of a nonnegative matrix $M$ with positive diagonal entries, then $G'$ is the graph of $M'$ and $G' \circ G$ is the graph of $M'M$.

Lemma  5:  A  directed  graph  G  with  self-arcs  at  all 
vertices  is  weakly  connected  if  and  only  if  G'  o  G  is 
strongly  connected. 

Proof of Lemma 5: Since $G$ has self-arcs at all vertices so does $G'$. This implies that the arc set of $G' \circ G$ contains the arc sets of $G$ and $G'$. Thus, for any undirected path in $G$ between vertices $i$ and $j$, there must be a corresponding directed path in $G' \circ G$ between the same two vertices. Thus, if $G$ is weakly connected, $G' \circ G$ must be strongly connected.

Now suppose that $(i, j)$ is an arc in $G' \circ G$. Then, because of the definition of composition, there must be a vertex $k$ such that $(i, k)$ is an arc in $G$ and $(k, j)$ is an arc in $G'$. This implies that $(i, k)$ and $(j, k)$ are arcs in $G$. Thus, $G$ has an undirected path from $i$ to $j$. Now suppose that $(i, v_1), (v_1, v_2), \ldots, (v_q, j)$ is a directed path in $G' \circ G$ between $i$ and $j$. Between each pair of successive vertices along this path there must therefore be an undirected path in $G$. Thus, there must be an undirected path in $G$ between $i$ and $j$. It follows that if $G' \circ G$ is strongly connected, then $G$ is weakly connected. ■
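Lemma 5 can be tried out on small graphs. The following plain-Python sketch represents a directed graph on vertices 0, ..., n-1 as a set of arcs (i, j) and uses the definition of composition from the proof above: (i, j) is an arc of the composition of G2 with G1 when some k has (i, k) in G1 and (k, j) in G2. All function names and the two example graphs are illustrative assumptions.

```python
def with_self_arcs(n, arcs):
    """Add a self-arc at every vertex 0..n-1."""
    return set(arcs) | {(v, v) for v in range(n)}

def reverse(arcs):
    """Dual graph: every arc reversed."""
    return {(j, i) for (i, j) in arcs}

def compose(G2, G1):
    """Arcs (i, j) such that (i, k) is in G1 and (k, j) is in G2 for some k."""
    return {(i, j) for (i, k) in G1 for (k2, j) in G2 if k == k2}

def reachable(n, arcs, start):
    """Vertices reachable from start along directed arcs (start included)."""
    succ = {v: set() for v in range(n)}
    for i, j in arcs:
        succ[i].add(j)
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in succ[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen

def strongly_connected(n, arcs):
    return all(len(reachable(n, arcs, v)) == n for v in range(n))

def weakly_connected(n, arcs):
    """Connectivity after forgetting arc directions (assumes n >= 1)."""
    return len(reachable(n, set(arcs) | reverse(arcs), 0)) == n

# Hypothetical examples on three vertices.
G = with_self_arcs(3, {(0, 1), (2, 1)})   # weakly but not strongly connected
H = with_self_arcs(3, {(0, 1)})           # vertex 2 is isolated

G_dual_G = compose(reverse(G), G)
H_dual_H = compose(reverse(H), H)
```

On these examples weak connectivity of the graph and strong connectivity of the composition of its dual with itself agree, as the lemma asserts.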
Lemma 6: Let $T$ be a stochastic matrix with positive diagonal entries. If $T$ has a strongly connected graph, then the magnitude of its second largest eigenvalue is less than 1. If, on the other hand, the magnitude of the second largest eigenvalue of $T$ is less than 1, then the graph of $T$ is weakly connected.

Proof of Lemma 6: Suppose that the graph of $T$ is strongly connected. Then, via Theorem 6.2.24 of [27], $T$ is irreducible. Thus, there is an integer $k$ such that $(I + T)^k > 0$. Since $T$ has positive diagonal entries, this implies that $T^k > 0$. Therefore, $T$ is primitive [27]. Thus, by the Perron-Frobenius theorem [26], $T$ can have only one eigenvalue of maximum modulus. Since the spectral radius of $T$ is 1 and 1 is an eigenvalue, the magnitude of the second largest eigenvalue of $T$ must be less than 1.

To prove the converse, suppose that $T$ is a stochastic matrix whose second largest eigenvalue in magnitude is less than 1. Then

$$\lim_{i \to \infty} T^i = \mathbf{1}c$$

for some row vector $c$. Suppose that the graph of $T$ is not weakly connected. Therefore, if $q$ denotes the number of weakly connected components of the graph, then $q > 1$. This implies that $T = P'DP$ for some permutation matrix $P$ and block diagonal matrix $D$ with $q$ blocks. Since $D = PTP'$, $D$ is also stochastic. Thus, each of its $q$ diagonal blocks is stochastic. Since $T^i$ converges to $\mathbf{1}c$, $D^i$ must converge to a matrix of the form $\mathbf{1}c$. But this is clearly impossible because $\mathbf{1}c$ cannot have $q > 1$ diagonal blocks. ■

Proof of Theorem 8: Let $S$ be a doubly stochastic matrix with positive diagonal entries. Then, 1 is the largest singular value of $S$ because $S'S$ is doubly stochastic. From this and Theorem 7 it follows that $|S|_2 \le 1$.

Suppose  S  is  a  semicontraction.  Then,  in  view  of 
Theorem  7,  the  second  largest  eigenvalue  of  S'S  is  less 
than  1.  Thus,  by  Lemma  6,  the  graph  of  S'S  is  weakly 
connected.  But  S'S  is  symmetric  so  its  graph  must  be 
strongly  connected.  Therefore,  by  Lemma  5,  the  graph  of 
S  is  weakly  connected. 

Now  suppose  that  the  graph  of  S  is  weakly  connected. 
Then,  the  graph  of  S'S  is  strongly  connected  because  of 
Lemma  5.  Thus,  by  Lemma  6,  the  magnitude  of  the  second 
largest  eigenvalue  of  S'S  is  less  than  1.  From  this  and 
Theorem  7  it  follows  that  S  is  a  semicontraction.  ■ 

Proof of Theorem 5: Let $\mathcal{M}$ be any compact subset of $\mathcal{W}$. In view of Theorem 8, each matrix in $\mathcal{M}$ is a semicontraction in the 2-norm. From this and Proposition 3, it follows that $\mathcal{M}$ is convergable.

Now suppose that $\mathcal{M}$ is convergable and let $S$ be a matrix in $\mathcal{M}$. Then, $S^i$ converges to a matrix of the form $\mathbf{1}c$ as $i \to \infty$. This means that the second largest eigenvalue of $S$ must be less than 1 in magnitude. Thus, by Lemma 6, $S$ must have a weakly connected graph. ■


The importance of Theorem 8 lies in the fact that the matrices in every convergable set of doubly stochastic matrices are semicontractions in the 2-norm. In view of Proposition 3, this enables one to immediately compute a rate of convergence for any infinite product of matrices from any given convergable set. The coefficient of ergodicity mentioned earlier does not have this property. If it did, then every doubly stochastic matrix with a weakly connected graph would have to be a scrambling matrix. The following counterexample shows that this is not the case:

$S =$ [matrix not recoverable from the source]

In particular, $S$ is a doubly stochastic matrix with a weakly connected graph but it is not a scrambling matrix.

D.  Contraction  Coefficient 

By  the  contraction  coefficient  of  a  gossip  matrix  M  is 
meant  the  seminorm  |M|2.  The  main  result  we  want  to 
prove  is  as  follows. 

Theorem  9:  A  gossip  matrix  M  is  complete  if  and  only  if 
its  contraction  coefficient  is  less  than  one. 

Before turning to a proof of this theorem, let us consider its consequences. As in the hypothesis of Theorem 1, let $M(1), M(2), M(3), \ldots$ denote the gossiping matrices corresponding to an infinite sequence of single gossips which is repetitively complete with period $T$. Our aim is to explain how fast the matrix product $M(t)M(t-1)\cdots M(1)$ converges to $(1/n)\mathbf{1}\mathbf{1}'$, and in so doing provide a proof of Theorem 1. Towards this end note first that each $M(t) \in \mathcal{P}$, the set of all $n \times n$ single-gossip matrices. Let $\mathcal{C}$ denote the set of all complete gossip matrices which are products of exactly $T$ single-gossip matrices from $\mathcal{P}$. Note that $\mathcal{C}$ is a compact set because $\mathcal{P}$ is. For each $S \in \mathcal{C}$, let $|S|_2$ denote the contraction coefficient of $S$. Then, in view of Theorem 9, $|S|_2 < 1$, $S \in \mathcal{C}$. Therefore, the nonnegative number

$$\mu = \max_{S \in \mathcal{C}} |S|_2$$

is less than one. Next observe that since $M(1), M(2), M(3), \ldots$ corresponds to a repetitively complete gossip sequence, each matrix $N_i = M(iT)M(iT-1)\cdots M((i-1)T+1)$, $i \ge 1$, must be in $\mathcal{C}$; thus, $|N_i|_2 \le \mu$, $i \ge 1$. It follows from Proposition 3 that as $i \to \infty$, the matrix product $N_iN_{i-1}\cdots N_1$ converges to $(1/n)\mathbf{1}\mathbf{1}'$ as fast as $\mu^i$ converges to zero. Thus, if we define $\lambda = \mu^{1/T}$, then as $t \to \infty$, the matrix product $M(t)M(t-1)\cdots M(1)$ also converges to $(1/n)\mathbf{1}\mathbf{1}'$, in this case as fast as $\lambda^t$ converges to zero. The final assertion of Theorem 1 follows at once.
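The convergence described above can be observed in a small simulation. The sketch below is a minimal plain-Python illustration, assuming that a single gossip between agents i and j replaces both of their values with the pairwise average; the four-agent path, the period, the initial values, and all names are hypothetical choices made for the example.

```python
def gossip(x, i, j):
    """Single gossip (assumed form): agents i and j both take their average."""
    avg = 0.5 * (x[i] + x[j])
    x[i] = x[j] = avg

def run_periodic_gossip(x0, period, num_periods):
    """Apply the gossips listed in `period`, in order, num_periods times."""
    x = list(x0)
    for _ in range(num_periods):
        for i, j in period:
            gossip(x, i, j)
    return x

def max_deviation(x, target):
    return max(abs(v - target) for v in x)

# Four agents on a path; the period (0,1), (1,2), (2,3) covers a spanning
# tree of the path, so each period is a complete gossip sequence with T = 3.
x0 = [4.0, 0.0, -2.0, 6.0]
period = [(0, 1), (1, 2), (2, 3)]
average = sum(x0) / len(x0)

after_1 = run_periodic_gossip(x0, period, 1)
after_100 = run_periodic_gossip(x0, period, 100)
```

Each gossip preserves the sum of the agents' values, so the only possible common limit is the initial average; the deviation from that average shrinks geometrically with the number of completed periods.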

We  now  turn  to  the  proof  of  Theorem  9.  For  this  we 
need  several  preliminary  results. 

Lemma 7: Let $G$ and $H$ be directed graphs on the same $n$ vertices. Suppose that both graphs have self-arcs at all vertices. If there is an undirected path from $i$ to $j$ in $H \circ G$, then there is an undirected path from $i$ to $j$ in the union of $H$ and $G$.

Proof of Lemma 7: First suppose that there is an undirected path of length one between vertices $i$ and $j$ in $H \circ G$. Then, either $(i, j)$ or $(j, i)$ must be an arc in $H \circ G$. Without loss of generality suppose that $(i, j)$ is an arc in $H \circ G$. Then, because of the definition of composition, there must be a vertex $k$ such that $(i, k)$ is an arc in $G$ and $(k, j)$ is an arc in $H$. This implies that $(i, k)$ and $(k, j)$ are arcs in $H \cup G$. Thus, $H \cup G$ has a directed path from $i$ to $j$. Therefore, $H \cup G$ has an undirected path from $i$ to $j$. Now suppose that $(i, v_1), (v_1, v_2), \ldots, (v_q, j)$ is an undirected path in $H \circ G$ between $i$ and $j$. Between each pair of successive vertices along this path there must therefore be an undirected arc in $H \circ G$. Therefore, between each pair of successive vertices along this path there must be an undirected path in $H \cup G$. Thus, there must be an undirected path in $H \cup G$ between $i$ and $j$. ■

It is obvious that the preceding lemma extends from two graphs to a finite set of directed graphs. In the following, we appeal to this extension without special mention. The next result characterizes completeness of a gossip matrix in terms of a property of its graph.

Lemma  8:  A  gossip  matrix  is  complete  if  and  only  if  its 
graph  is  weakly  connected. 

Proof of Lemma 8: Since $M$ is a gossip matrix, there exist single-gossip matrices $P_1, P_2, \ldots, P_m$ such that $M = P_mP_{m-1}\cdots P_1$. Write $(j_i, k_i)$ for the edge in N associated with $P_i$ and let $G$ denote the spanning subgraph of N whose edges are $(j_1, k_1), (j_2, k_2), \ldots, (j_m, k_m)$. Since each $P_i$ is a single-gossip matrix, its graph $\gamma(P_i)$ must contain arcs from $j_i$ to $k_i$ and from $k_i$ to $j_i$. This means that the union of the $\gamma(P_i)$, written $U$, will be weakly connected if and only if $G$ is a connected graph. Therefore, weak connectivity of $U$ is equivalent to $M$ being complete.

In general, the composition of two directed graphs contains the union of the two graphs whenever the two graphs in question have self-arcs at all vertices. This means that $U$ must be a subgraph of $\gamma(M)$ because each vertex of each $\gamma(P_i)$ has a self-arc. But it has just been shown that completeness of $M$ implies weak connectivity of $U$. It follows that completeness of $M$ must also imply weak connectivity of $\gamma(M)$.

Suppose next that $\gamma(M)$ is weakly connected. Then, $U$ must be weakly connected because of Lemma 7. Therefore, $M$ must be complete. ■


Proof of Theorem 9: Since $M$ is a product of matrices with positive diagonal entries, $M$ has positive diagonal entries. By Lemma 8, $M$ is complete if and only if $\gamma(M)$ is weakly connected. By Theorem 8, any doubly stochastic matrix with positive diagonal entries has a contraction coefficient less than one just in case the graph of the matrix is weakly connected. Therefore, $M$ is complete if and only if it has a contraction coefficient less than one. ■

As it stands the proof of Theorem 1 rests on the assumption that the graph $\gamma(M)$ is weakly connected. Nedic et al. [23] derive a result very similar to Theorem 1 under what at first glance appears to be a more restrictive assumption, namely that $\gamma(M)$ is strongly connected. But because doubly stochastic matrices constitute a special class of stochastic matrices, the assumption of strong connectivity turns out to be not restrictive at all. Here is why.

Lemma  9:  The  graph  G  of  a  doubly  stochastic  matrix  D  is 
strongly  connected  if  and  only  if  it  is  weakly  connected.5 

The  proof  of  Lemma  9  which  follows  is  based  on  ideas 
from  [26]  and  [34].  Let  G  be  a  directed  graph  with  vertex 
set V = {1, 2, ..., n}. Call a vertex j reachable from i if
either  j  =  i  or  if  there  is  a  directed  path  from  i  to  j.  Call  a 
vertex  i  essential  if  i  is  reachable  from  all  vertices  which  are 
reachable  from  i. 

Lemma  10:  Every  directed  graph  has  at  least  one 
essential  vertex. 

Proof of Lemma 10: Suppose that G has n > 0
vertices. If G has an isolated vertex i, then i is essential.
Consider next the case when G has no isolated vertices, in
which case n > 1. Suppose that G has no essential vertices.
Then, for each vertex i there must be a vertex j which is
reachable from i but from which i is not reachable. It
follows that it is possible to construct a sequence of vertices
i_1, i_2, ..., i_m of any length m > 1 such that i_{j+1} is reachable
from i_j, j ∈ {1, 2, ..., m − 1}, but not conversely. But for
m > n at least one vertex must appear in the list twice, say
in positions j and k > j. Since reachability is transitive, this
implies that vertex i_{j+1} is reachable from i_j and conversely,
which is a contradiction.
Therefore, G must have an essential vertex. ■

To  proceed,  let  us  say  that  vertices  i  and  j  are  mutually 
reachable  if  each  is  reachable  from  the  other.  Mutual 
reachability  is  clearly  an  equivalence  relation  on  V  which 
partitions  V  into  the  disjoint  union  of  a  finite  number  of 
equivalence  classes.  Note  that  if  i  is  an  essential  vertex  of 
G,  then  every  vertex  in  the  equivalence  class  of  i  is  also 
essential.  Thus,  every  directed  graph  possesses  at  least  one 
mutually  reachable  equivalence  class  whose  members  are 
all  essential. 
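
The definitions of reachable, essential and mutually reachable vertices can be checked mechanically on small graphs. The following is a minimal sketch, assuming a directed graph stored as a dictionary that maps each vertex to a list of its out-neighbours; the function names and the example graph are hypothetical and are not taken from the paper.

```python
from collections import deque


def reachable_from(graph, i):
    """Return the set of vertices reachable from i (i itself included)."""
    seen = {i}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for w in graph.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def essential_vertices(graph):
    """Vertices i that are reachable from every vertex reachable from i."""
    reach = {i: reachable_from(graph, i) for i in graph}
    return {i for i in graph if all(i in reach[j] for j in reach[i])}


def mutual_classes(graph):
    """Partition the vertices into mutually reachable equivalence classes."""
    reach = {i: reachable_from(graph, i) for i in graph}
    classes = []
    for i in graph:
        cls = frozenset(j for j in reach[i] if i in reach[j])
        if cls not in classes:
            classes.append(cls)
    return classes


# Hypothetical example: arcs 1 -> 2, 2 -> 3, 3 -> 2 and 4 -> 1.
example = {1: [2], 2: [3], 3: [2], 4: [1]}
print(essential_vertices(example))
print(mutual_classes(example))
```

In this example the class {2, 3} is the only mutually reachable class whose members are all essential, which is the kind of class whose existence Lemma 10 guarantees.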

5 It is clear that strong connectivity of G implies weak connectivity
of G. The converse was conjectured by John Tsitsiklis in a private
communication.
Proof of Lemma 9: Strong connectivity clearly implies
weak connectivity. We prove the converse. Suppose G is
weakly connected. In view of the preceding, G has at
least one mutually reachable equivalence class E whose
members are all essential. If E = V, then G is obviously
strongly connected. Thus, to prove the lemma, it is enough
to show that E = V. Suppose the contrary, namely that
E = {i_1, i_2, ..., i_m} is a strictly proper subset of V. Let π be
any permutation map for which π(i_j) = j, j ∈ {1, 2, ...,
m}, and let P be the corresponding permutation matrix.
Then, clearly

\[ P'DP = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix} \]

where A is m × m, and P'DP is doubly stochastic. Since P'DP is doubly
stochastic, the column sums of A must all equal one as must
the row sums of the submatrix [A B]. But the transformation
D ↦ P'DP corresponds to a relabeling of the vertices
of G, so the graph of P'DP must also be weakly connected.
This means that B cannot be the zero matrix. Therefore,
the sum of the row sums of A must be less than m. But this
contradicts the fact that the sum of the column sums of A
equals m. Therefore, E = V. ■

We  are  now  in  a  position  to  prove  Theorem  2. 

Proof of Theorem 2: Suppose that M is a complete
gossip matrix. In view of Lemmas 8 and 9, γ(M) is a
strongly connected graph. Thus, by Lemma 6, the magnitude
of the second largest eigenvalue of M is less than 1.

To prove the converse, now suppose that M is a
gossip matrix whose second largest eigenvalue in magnitude
is less than one. By Lemma 6 the graph of M is
therefore weakly connected. Thus, by Lemma 8, M is
complete. ■

V.  CONCLUDING  REMARKS 

Let N be a given neighbor graph. Recall that a complete
gossip sequence is minimal if there is no shorter sequence
of gossips which is complete. It is easy to see that a complete
gossip sequence will be minimal if and only if the
gossip graph it induces is a minimal spanning tree in N.
For a given neighbor graph there can be many minimal
spanning trees and consequently many minimally complete
gossip sequences. Moreover, there can be differing
second largest singular values and different second largest
eigenvalues (in magnitude) for the different doubly
stochastic matrices associated with different complete
minimal sequences. A useful exercise then would be to
determine those complete minimal sequences whose
associated singular values or second largest eigenvalues
(in magnitude) are as small as possible.
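
The role of completeness can be illustrated numerically. The sketch below assumes the usual pairwise-averaging model of a single gossip, in which the two gossiping agents both replace their values by the average of the two; this update rule, the agent indexing and the example sequences are assumptions made for illustration. Repeating a gossip sequence whose gossip graph is a spanning tree drives all agents to the common average, whereas a sequence whose gossip graph is disconnected does not.

```python
def apply_gossips(x, sequence):
    """Apply the pairwise gossips (j, k) in order; each pair averages its values."""
    x = list(x)
    for j, k in sequence:
        avg = 0.5 * (x[j] + x[k])
        x[j] = x[k] = avg
    return x


def repeat_sequence(x0, sequence, rounds):
    """Apply the whole gossip sequence `rounds` times and return the final values."""
    x = list(x0)
    for _ in range(rounds):
        x = apply_gossips(x, sequence)
    return x


x0 = [4.0, 0.0, 8.0, 2.0]              # hypothetical initial values, average 3.5
complete = [(0, 1), (1, 2), (2, 3)]    # gossip graph is a spanning tree (a path)
incomplete = [(0, 1), (2, 3)]          # gossip graph is disconnected

print(repeat_sequence(x0, complete, 60))
print(repeat_sequence(x0, incomplete, 60))
```

Comparing how quickly the spread max(x) − min(x) shrinks for different spanning trees of the same neighbor graph is one simple way to explore the exercise suggested above.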

One of the problems with the idea of gossiping, which
apparently is not widely appreciated, is that it is difficult to
devise provably correct gossiping protocols which are
guaranteed to avoid deadlocks without making restrictive
assumptions. The research in this paper and in [12] and [13]
contributes to our understanding of this issue and how to
deal with it.
Analysis of Target Velocity and Position Estimation via Doppler-Shift Measurements


Iman Shames, Adrian N. Bishop, Matthew Smith and Brian D. O. Anderson


Abstract: This paper outlines the problem of doppler-based
target position and velocity estimation using a sensor network.
The minimum number of doppler shift measurements at
distinct generic sensor positions needed to have a finite number of
solutions, and later, a unique solution for the unknown target
position and velocity is stated analytically, for the case when
no measurement noise is present. Furthermore, we study the
same problem where not only doppler shift measurements are
collected, but also other types of measurements are available,
e.g. bearing or distance to the target from each of the sensors.
Subsequently, allowing nonzero measurement noise, we present
an optimization method to estimate the position and the velocity
of the target. An illustrative example is presented to show the
validity of the analysis and the performance of the estimation
method proposed. Some concluding remarks and future work
directions are presented at the end.

Index  Terms — Doppler  Measurements,  Localization,  Motion 
Estimation,  Polynomial  Optimization 


I.  Introduction 


Using doppler-shifts for position and velocity estimation
has a long history; see e.g. [1]-[8]. Recently, the doppler
effect has gained renewed interest and it has been implemented
for cooperative positioning in vehicular networks [9].

In this paper, we consider a scenario with n nodes with
both transmitting and sensing capabilities, which are called
sensors for the rest of this paper. The target has an unknown
position and velocity x = [p^T v^T]^T ∈ R^4. The position of
the non-collocated sensors is given by s_i = [s_{i,1} s_{i,2}]^T ∈ R^2,
∀i ∈ {1, ..., n}.

The measured doppler-shift is at the i-th sensor and is
caused by a target reflection due to a signal generated earlier
by the same sensor. This frequency shift can be approximated
by

\begin{align}
\hat{f}_i &= f_i + w_i \tag{1a} \\
          &= 2 \left( \frac{(p - s_i)^T}{\|p - s_i\|} \right) v \, \frac{f_{c,i}}{c} + w_i \tag{1b}
\end{align}

where c is the speed of light (or signal propagation), ||·||
is the standard Euclidean vector norm and f_{c,i} is the carrier
frequency employed by this sensor. Finally, w_i is a zero mean
Gaussian random variable with known variance σ_i^2. Note
here that the localization is to be achieved instantaneously;
we are not envisaging collecting information from agents at
a  number  of  successive  instants  of  time  and  using  them  to 
infer  position  at  a  single  instant  of  time  (The  connection  with 
filtering  methods  is  explored  further  below).  There  are  many 
studies  in  the  literature  that  try  to  solve  a  similar  problem 
via  collecting  measurements  over  a  time  interval  and  feeding 
them  into  an  estimator.  For  example,  in  [5]  the  problem  of 
localization  of  a  single  aircraft  using  doppler  measurements 
is  studied.  Other  similar  approaches  can  be  found  in  [4], 
[6]-[8]. The analysis carried out in this paper along with
the  optimization  method  proposed  can  be  considered  as 
constituting  a  batch  processing  method  for  instantaneous 
estimation  of  the  target  location  and  velocity.  This  in  turn 
is  used  to  initialize  and  improve  the  updates  of  any  other 
implemented  filter  which  tracks  the  target  position  as  the 
target  moves  in  the  environment.  For  example,  Kalman-based 
filters  are  prone  to  errors  when  the  target’s  motion  model 
deviates  significantly  from  the  actual  target  motion.  Our 
analysis  can  guard  against  such  behavior  and  re-initialize  the 
filter,  e.g.  see  [10]  and  references  therein. 


The  main  contribution  of  this  paper  is  that  the  minimum 
number  of  doppler  shift  measurements  required  to  have  a 
finite  number  of  solutions  for  the  unknown  x  is  algebraically 
derived.  In  some  scenarios,  a  separate  piece  of  knowledge 
about the target will allow disambiguation. Later, it is extended
to the case where having a unique solution is required.
Moreover, the scenarios where different types of measurements,
e.g. direction-of-arrival, or distance, are available
in  addition  to  doppler  shift  measurements  are  considered. 
The  aforementioned  conclusions  assume  zero  measurement 
noise;  following  from  this,  an  optimization  method  based  on 
polynomial  optimization  methods  is  introduced  to  calculate 
the  velocity  and  the  position  of  a  target  where  noisy  doppler 
shift  measurements  are  available. 


The remaining sections of this paper are organized as
follows. In the next section the main problem of interest
is considered. In Section III the case where different types
of measurement in addition to doppler shift measurements
are available to the sensors is considered. A method based
on polynomial optimization to estimate the position and
the velocity of the target where doppler shift measurements
are contaminated by noise is presented in Section IV. An
illustrative example presenting the performance of the proposed
optimization method is given in Section V. Concluding
remarks and future directions come in Section VI.


II.  Required  Minimum  Number  of  Doppler  Shift 
Measurements 


Initially, it is assumed f_{c,i} is the same for i = 1, ..., n
and that the measurements are noiseless, i.e. w_i = 0. Then

\[ \hat{f}_i = f_i = 2 \, \frac{v^T (p - s_i)}{\|p - s_i\|} \, \frac{f_c}{c}. \tag{2} \]

Normalizing so that 2 f_c / c = 1, we obtain

\[ f_i = \frac{v^T (p - s_i)}{\|p - s_i\|}. \tag{3} \]
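
The measurement model in (2) and (3) can be evaluated directly. The following is a minimal sketch, assuming planar coordinates given as 2-tuples; the function names and the numerical values are hypothetical and serve only to illustrate that the shift depends on the component of v along the line from the sensor to the target.

```python
import math


def normalized_shift(p, v, s):
    """Noiseless shift after normalizing 2 f_c / c = 1, as in (3)."""
    dx, dy = p[0] - s[0], p[1] - s[1]
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("target and sensor must not be collocated")
    return (v[0] * dx + v[1] * dy) / r


def doppler_shift(p, v, s, fc, c=299_792_458.0):
    """Noiseless shift f_i = 2 (f_c / c) v^T (p - s_i) / ||p - s_i||, as in (2)."""
    return 2.0 * (fc / c) * normalized_shift(p, v, s)


# Hypothetical geometry: sensor at the origin, target at (3, 4).
print(normalized_shift((3.0, 4.0), (3.0, 4.0), (0.0, 0.0)))   # velocity along the line of sight
print(normalized_shift((3.0, 4.0), (-4.0, 3.0), (0.0, 0.0)))  # velocity perpendicular to it
```

A velocity perpendicular to the line of sight produces no shift at that sensor, which is why a single sensor cannot determine v.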


Now  we  are  ready  to  pose  the  problem  of  interest  in  this 
section. 


Problem 1. Consider n stationary sensors at s_i ∈ R^2
capable of collecting noiseless doppler shift measurements
of the form (2) from a target at position p moving with a
nonzero velocity v.

1) What is the minimum value for n such that there is a
finite number of solutions for x?

2) What is the minimum value for n such that there is a
unique solution for x?

We limit our analysis to the case where the nodes and the
target are in R^2; however, note that the analysis for the case
that they are in R^3 is much the same.

The answer to the first question posed in Problem 1 is
formally presented in the following proposition for the case
where n = 4 doppler shift measurements are available. This
proposition states that with n = 4 measurements we have a
finite number of solutions and one might be able to disambiguate
the solutions using other measurements, e.g. range or
bearing measurements. Moreover, disambiguation may also
be made possible due to prior measurements, or to a priori
knowledge about the geographic constraints on targets in the
area of interest. This is particularly important for the cases
where the possible solutions are widely separated, e.g., see
Fig. 1.

Proposition 1. For n = 4 doppler measurements as described
by (2) and generic positions of the sensors there is
a finite number of solutions for the unknown x.

Proof: Denote the noiseless mapping from the agent
position and velocity, x = [p^T v^T]^T (a vector in R^4), to
measurements (another vector in R^n) by F, where n is the
number of sensors. More specifically, F(x) = [f_1 f_2 f_3 f_4]^T.
Denote by J_F the Jacobian of F:

\[ J_F = \begin{bmatrix}
\frac{\partial f_1}{\partial p_1} & \frac{\partial f_1}{\partial p_2} & \frac{\partial f_1}{\partial v_1} & \frac{\partial f_1}{\partial v_2} \\
\vdots & \vdots & \vdots & \vdots \\
\frac{\partial f_4}{\partial p_1} & \frac{\partial f_4}{\partial p_2} & \frac{\partial f_4}{\partial v_1} & \frac{\partial f_4}{\partial v_2}
\end{bmatrix} \tag{4} \]

where

\[ \frac{\partial f_i}{\partial p_1} = \frac{-(p_2 - s_{i,2})(s_{i,2} v_1 - s_{i,1} v_2 + p_1 v_2 - p_2 v_1)}{\|s_i - p\|^3} \tag{5} \]

\[ \frac{\partial f_i}{\partial p_2} = \frac{(p_1 - s_{i,1})(s_{i,2} v_1 - s_{i,1} v_2 + p_1 v_2 - p_2 v_1)}{\|s_i - p\|^3} \tag{6} \]

\[ \frac{\partial f_i}{\partial v_1} = \frac{p_1 - s_{i,1}}{\|s_i - p\|} \tag{7} \]

\[ \frac{\partial f_i}{\partial v_2} = \frac{p_2 - s_{i,2}}{\|s_i - p\|} \tag{8} \]

It is easy to check that for generic values of x, s_i, i =
1, ..., 4, J_F is not singular. Moreover, we know that the
set of solutions to (2) forms an algebraic variety. The reason
for this is that while (2) is not a polynomial equation, it is
easy to see that its zeros are also the zeros of the following
polynomial equation:

\[ f_i^2 \, \|p - s_i\|^2 - \left( v^T (p - s_i) \right)^2 = 0. \tag{9} \]

The set of the solutions to the equations described by (9)
is known to have at least one member; that is the solution
corresponding to the physical setup. The nonsingularity of
the Jacobian implies that for generic values of the measurements
we do not have a continuous set of solutions. As a result, the
variety is a zero-dimensional variety. This means that there
is a finite number of solutions for the localization problem
using doppler measurements where n ≥ 4. □

Now we briefly consider the case where the Jacobian
matrix J_F is singular. This corresponds to those sensor
and target geometries where there is an infinite number of
solutions for the unknown p and v. We call these
bad geometries. The following proposition characterizes
one of these bad geometries.

Proposition 2. The Jacobian matrix J_F (4) is singular if the
sensors s_1, ..., s_4 and the target p are collinear.

Proof: To prove this proposition it is enough to evaluate
(4) for the case where s_1, ..., s_4, and p lie on the same line.
The calculations are trivial and are omitted for brevity. □
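
The Jacobian entries can be evaluated numerically to illustrate Propositions 1 and 2. The sketch below assumes the normalized measurement f_i = v^T(p − s_i)/||p − s_i||, with partial derivatives −d_2 k/r^3, d_1 k/r^3, d_1/r and d_2/r, where d = p − s_i, r = ||d|| and k = s_{i,2} v_1 − s_{i,1} v_2 + p_1 v_2 − p_2 v_1. The helper names and both geometries are hypothetical; the determinant is computed with a plain Gaussian elimination rather than any particular library routine.

```python
import math


def shift(p, v, s):
    """Normalized noiseless shift f = v^T (p - s) / ||p - s||."""
    d1, d2 = p[0] - s[0], p[1] - s[1]
    return (v[0] * d1 + v[1] * d2) / math.hypot(d1, d2)


def jacobian_row(p, v, s):
    """Partial derivatives of f with respect to (p1, p2, v1, v2)."""
    d1, d2 = p[0] - s[0], p[1] - s[1]
    r = math.hypot(d1, d2)
    k = s[1] * v[0] - s[0] * v[1] + p[0] * v[1] - p[1] * v[0]
    return [-d2 * k / r**3, d1 * k / r**3, d1 / r, d2 / r]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def jacobian_det(p, v, sensors):
    """Determinant of the 4 x 4 Jacobian for four sensors."""
    return det([jacobian_row(p, v, s) for s in sensors])


# Hypothetical geometries: four scattered sensors, and four sensors on the line y = x.
scattered = [(0.0, 0.0), (10.0, 1.0), (2.0, 9.0), (-4.0, 3.0)]
collinear = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (6.0, 6.0)]
print(jacobian_det((3.0, 5.0), (1.0, 2.0), scattered))
print(jacobian_det((2.0, 2.0), (1.0, 2.0), collinear))  # target also on y = x
```

In the collinear case every row of the Jacobian lies in a two-dimensional subspace, so the computed determinant is zero up to rounding; a determinant that is merely small signals the nearly collinear geometries that are also problematic.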

It  is  worthwhile  to  note  that  any  geometry  in  which  the 
sensors  and  the  target  are  almost  collinear  will  also  be 
problematic. 

After establishing that there is a finite number of solutions
for generic positions of the sensors where four doppler shift
measurements are available, we present an answer to the
second question posed in Problem 1. Before we formally propose
the answer, note that the number of unknowns is four,
and that with four pieces of data one gets four polynomial
equations, which have multiple solutions (though they may
not all be real). However, with five pieces of data, one expects
the associated equations to have a unique solution. This
solution is the one solution common to two selections of
four. In the next proposition we prove that the position and
the velocity of the target can be uniquely calculated if there
are five doppler measurements available.

Proposition 3. For n ≥ 5 doppler measurements as described
by (2) and generic positions for the sensors there is
a unique solution for the unknown x.

Fig. 1. Placing the fifth sensor at any of the positions indicated by * results
in an ambiguous solution for the position of the target.


Proof: From Proposition 1 we know that for n = 4
there is a finite number of solutions for x. Call the solutions
for the position and the velocity of the target x_1, ..., x_m.
Now, temporarily regard the position of sensor 5, s_5 =
[s_{5,1} s_{5,2}]^T, as an unknown. Consider the relationship between
each of these solutions, x_i = [p_{i,1} p_{i,2} v_{i,1} v_{i,2}]^T, and the
position of sensor 5, s_5:

\[ f_5^2 \left( (p_{i,1} - s_{5,1})^2 + (p_{i,2} - s_{5,2})^2 \right) - \left( v_{i,1} (p_{i,1} - s_{5,1}) + v_{i,2} (p_{i,2} - s_{5,2}) \right)^2 = 0 \]

This equation results in two straight lines intersecting at p_i,
on at least one of which the true s_5 must lie, namely that
associated with the correct target position. We claim that
generically only one such equation can be satisfied by the
true s_5. To establish the claim, we argue by contradiction.
Assume that x_j and x_k define two different loci on which s_5
must lie. These loci intersect in at most four points. So for
all positions of s_5 except these four points, x_j and x_k cannot
simultaneously be solutions to the localization problem
via doppler measurements. With similar arguments one can
eliminate any multiplicity of solutions for generic values of
doppler shift measurements for n ≥ 5. □

The  proof  of  Proposition  3  indicates  that,  given  four 
sensors  in  generic  positions  there  are  isolated  positions  where 
placing  a  fifth  sensor  does  not  lead  to  a  unique  solution 
for  x.  An  example  of  this  scenario  is  depicted  in  Fig.  1, 
where two possible solutions for the location of the target
are  shown.  Moreover,  four  positions  are  identified  such  that 
the  placement  of  a  fifth  sensor  at  any  of  these  positions 
will  not  resolve  the  ambiguity  in  the  target  position.  An 
interesting  direction  for  future  work  would  be  to  consider 
the  notion  of  optimal  sensor  placement  as  discussed  in,  e.g., 
[11],  [12]  which  not  only  considers  ambiguities  but  also  the 
likely  estimator  performance  given  the  relative  sensor-target 
geometry. 

So far, we have considered the case where the carrier
frequency, f_c, is known. In some cases this is not realistic; in
what comes next we show that from a single set of
instantaneous doppler-only measurements there is no way to
separate f_c and v. We conclude this section by formalizing
this fact and presenting a result on the case where the carrier
frequency is unknown.


Proposition 4. For n ≥ 5 doppler measurements as described
by (2) and generic positions for the sensors, and
unknown carrier frequency f_c, there is a unique solution for
the unknown p and for the vector f_c v.

Proof: Define v̄ = [v̄_1 v̄_2]^T = f_c v. For (2) we have

\[ f_i = 2 \, \frac{\bar{v}^T (p - s_i)}{c \, \|p - s_i\|}. \tag{11} \]

From Proposition 3 we know that using five equations of the
form (11), p and v̄ can be calculated uniquely. It follows
easily that if one knows neither f_c nor v and one estimates
v̄ = f_c v, then for any chosen f_c (or v) there is a corresponding
value of v (or f_c) that satisfies v̄ = f_c v. Thus, f_c and v
cannot be calculated separately.

Remark 1. By using the calculated value of p in consecutive
time steps, one can estimate the velocity of the agent. Using
this estimated velocity and knowing v̄, one can further
estimate f_c.


III.  Required  Minimum  Number  of  Hybrid 
Measurements 

In  this  section  we  study  the  effect  of  having  other  types  of 
measurements  additional  to  doppler  shift  measurements  on 
calculating  the  velocity  and  the  position  of  the  target. 

Before  continuing  further,  we  note  that  to  have  a  unique 
solution  for  a  set  of  polynomial  equations  there  is  usually 
a need to have more equations than unknowns, save in
cases  where  the  equations  are  linear.  However,  having  more 
equations  than  unknowns  does  not  of  itself  guarantee  the 
existence  of  a  unique  solution;  the  extra  equation  must  in 
some  way  be  independent,  and  it  may  have  this  property 
in  almost  all  circumstances,  i.e.  generically,  but  not  always. 
In  this  section  we  establish  the  cases  where  it  can  be 
mathematically  shown  that  a  unique  solution  for  x  exists 
when  a  combination  of  doppler  and  other  measurements  is 
available. 

First,  we  consider  the  case  where  in  addition  to  the  doppler 
shift  measurements  described  earlier,  the  distances  between 
each  of  the  sensors  and  the  target  can  be  measured  as  well. 
Denote  the  distance  between  the  sensor  i  and  the  target  as 
di  where 

\[ d_i = \|p - s_i\|, \quad i = 1, \ldots, m \tag{12} \]

We  have  the  following  results: 

Proposition 5. For n ≥ 2 doppler measurements as described
by (2) and m ≥ 3 distance measurements (12) and
generic positions of the sensors there is a unique solution
for the unknown x.

Proof:  With  three  or  more  distance  measurements  the 
position  of  the  target  can  be  determined  uniquely;  i.e.  three 
generic  circles  have  at  most  a  single  point  of  intersection. 


Knowing  the  position,  then  the  doppler  equations  are  linear 
equations  in  the  velocity  of  the  target.  Hence,  a  unique 
solution  for  the  velocity  follows.  □ 
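
The two steps of this proof can be sketched numerically: the position follows from three distances by subtracting the circle equations, and the velocity then follows from two doppler equations that are linear in v. The code below is a minimal illustration in normalized units (2 f_c / c = 1), not the authors' implementation; the function names and the test geometry are hypothetical.

```python
import math


def solve2(a, b):
    """Solve the 2 x 2 linear system a x = b by Cramer's rule."""
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(d) < 1e-12:
        raise ValueError("system is singular or nearly singular")
    x = (b[0] * a[1][1] - a[0][1] * b[1]) / d
    y = (a[0][0] * b[1] - b[0] * a[1][0]) / d
    return (x, y)


def position_from_distances(sensors, dists):
    """Intersect three circles by subtracting the first circle equation from the others."""
    (x1, y1), (x2, y2), (x3, y3) = sensors
    d1, d2, d3 = dists
    a = [[2 * (x2 - x1), 2 * (y2 - y1)],
         [2 * (x3 - x1), 2 * (y3 - y1)]]
    b = [d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2,
         d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2]
    return solve2(a, b)


def velocity_from_shifts(p, sensors, shifts):
    """With p known, each shift gives one linear equation u_i^T v = f_i."""
    rows = []
    for s in sensors:
        dx, dy = p[0] - s[0], p[1] - s[1]
        r = math.hypot(dx, dy)
        rows.append([dx / r, dy / r])
    return solve2(rows, shifts)


# Hypothetical noiseless scenario.
sensors = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0)]
p_true, v_true = (1.0, 2.0), (1.0, 0.5)
dists = [math.hypot(p_true[0] - s[0], p_true[1] - s[1]) for s in sensors]
shifts = [((p_true[0] - s[0]) * v_true[0] + (p_true[1] - s[1]) * v_true[1])
          / math.hypot(p_true[0] - s[0], p_true[1] - s[1]) for s in sensors[:2]]

p_est = position_from_distances(sensors, dists)
v_est = velocity_from_shifts(p_est, sensors[:2], shifts)
print(p_est, v_est)
```

The velocity step requires the two lines of sight to be non-parallel, which is one instance of the genericity assumption in the proposition.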

Proposition 6. For n ≥ 3 doppler measurements as described
by (2) and two distance measurements (12) and
generic positions of the sensors there is a unique solution
for the unknown x.

Proof:  The  two  distance  measurements  pin  down  the 
target  to  a  binary  ambiguity;  i.e.  two  generic  circles  have 
two  points  of  intersection.  Using  each  of  these  positions, 
then  from  the  doppler  equations  we  obtain  two  sets  of  three 
linear  equations  in  the  velocity  of  the  target.  Generically, 
only  one  of  these  sets  forms  a  consistent  system  of  linear 
equations.  □ 
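
The disambiguation argument can be sketched as follows: compute the two intersection points of the range circles, fit a velocity to the three doppler equations at each candidate, and keep the candidate whose equations are consistent. This is an illustrative sketch in normalized units (2 f_c / c = 1); the function names and the scenario are hypothetical, and consistency is judged here by the least-squares residual.

```python
import math


def circle_intersections(s1, d1, s2, d2):
    """Return the two intersection points of the circles (s1, d1) and (s2, d2)."""
    ex, ey = s2[0] - s1[0], s2[1] - s1[1]
    base = math.hypot(ex, ey)
    a = (d1**2 - d2**2 + base**2) / (2 * base)   # distance from s1 along the baseline
    h = math.sqrt(max(d1**2 - a**2, 0.0))        # offset perpendicular to the baseline
    mx, my = s1[0] + a * ex / base, s1[1] + a * ey / base
    return [(mx - h * ey / base, my + h * ex / base),
            (mx + h * ey / base, my - h * ex / base)]


def fit_velocity(p, sensors, shifts):
    """Least-squares v for the equations u_i^T v = f_i, and the squared residual."""
    rows = []
    for s in sensors:
        dx, dy = p[0] - s[0], p[1] - s[1]
        r = math.hypot(dx, dy)
        rows.append((dx / r, dy / r))
    a11 = sum(u[0] * u[0] for u in rows)
    a12 = sum(u[0] * u[1] for u in rows)
    a22 = sum(u[1] * u[1] for u in rows)
    b1 = sum(u[0] * f for u, f in zip(rows, shifts))
    b2 = sum(u[1] * f for u, f in zip(rows, shifts))
    d = a11 * a22 - a12 * a12
    v = ((b1 * a22 - a12 * b2) / d, (a11 * b2 - a12 * b1) / d)
    res = sum((u[0] * v[0] + u[1] * v[1] - f) ** 2 for u, f in zip(rows, shifts))
    return v, res


def resolve(range_sensors, dists, doppler_sensors, shifts):
    """Pick the circle intersection whose doppler equations are most consistent."""
    candidates = circle_intersections(range_sensors[0], dists[0],
                                      range_sensors[1], dists[1])
    fits = [(fit_velocity(p, doppler_sensors, shifts), p) for p in candidates]
    (v, res), p = min(fits, key=lambda item: item[0][1])
    return p, v, res


# Hypothetical noiseless scenario.
sensors = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0)]
p_true, v_true = (1.0, 2.0), (1.0, 0.5)
dists = [math.hypot(p_true[0] - s[0], p_true[1] - s[1]) for s in sensors[:2]]
shifts = [((p_true[0] - s[0]) * v_true[0] + (p_true[1] - s[1]) * v_true[1])
          / math.hypot(p_true[0] - s[0], p_true[1] - s[1]) for s in sensors]
print(resolve(sensors[:2], dists, sensors, shifts))
```

In this scenario the mirror-image candidate (1, −2) leaves a clearly nonzero residual, so only the true position yields a consistent linear system.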

We now consider the case that instead of distance measurements,
bearing measurements to the target at each of the
sensors are available, viz.

\[ \rho_i = [\rho_{i,1} \;\; \rho_{i,2}]^T = \frac{p - s_i}{\|p - s_i\|} \tag{13} \]

is known at each sensor i. We formally consider this scenario
in the next proposition.

Proposition 7. For n ≥ 2 doppler measurements as described
by (2) and bearing measurements (13) there is a
unique solution for the unknown x.

Proof:  With  at  least  two  bearing  measurements  the 
position  of  the  target  can  be  determined  uniquely.  Knowing 
the  position,  then  the  doppler  equations  are  linear  equations 
in  the  velocity  of  the  target.  Hence,  a  unique  solution  for  the 
velocity  follows.  □ 

In  what  comes  next  we  consider  the  case  where  only  one 
of  the  sensors  is  equipped  with  the  capability  to  measure  the 
target  bearing.  Without  loss  of  generality  assume  that  only 
sensor 1 can collect a bearing measurement to the target in
addition  to  the  doppler  shift  measurement.  The  rest  of  the 
sensors  can  only  measure  the  doppler  shift.  For  this  scenario 
we  have  the  following  result. 

Proposition 8. For n ≥ 4 doppler measurements as described
by (2) and only one measured bearing to the target
there is a unique solution for the unknown x.


Proof: Without loss of generality assume that the
bearing measurement is measured at s_1. For the doppler
measurement at sensor 1 we have

\begin{align}
f_1 &= 2 \, v^T \rho_1 \, \frac{f_c}{c} \nonumber \\
f_1 &= 2 \, (v_1 \rho_{1,1} + v_2 \rho_{1,2}) \, \frac{f_c}{c} \tag{14}
\end{align}

that is a linear equation in v. Moreover, we have

\[ p_2 = s_{1,2} + \frac{\rho_{1,2}}{\rho_{1,1}} \, (p_1 - s_{1,1}). \tag{15} \]

Calculating v_2 in terms of v_1 from (14) and p_2 in terms of
p_1 from (15) and replacing them in

\[ f_i = 2 \, \frac{v^T (p - s_i)}{\|p - s_i\|} \, \frac{f_c}{c}, \quad i = 2, 3, 4 \tag{16} \]

we obtain three quadratic equations in v_1 and p_1 only.
Furthermore, it is known that generically three quadratic
equations in two variables have a unique solution. Hence,
x can be determined uniquely. □

In the next section, we introduce an algorithm to estimate
the position and the velocity of the target using the doppler-shift
measurements measured at each of the sensors.


IV.  An  Algorithm  to  Estimate  the  Position  and 
the  Velocity  of  the  Target 


To calculate the position and the velocity of the target in
the noiseless case it is enough to solve the following system
of equations:

\[ \delta_i - \frac{v^T (p - s_i)}{\|p - s_i\|} = 0, \quad i = 1, \ldots, n \tag{17} \]

where δ_i = c f_i / (2 f_c). Equivalently, due to the fact that the target
and the sensors are not collocated (i.e. ||p − s_i|| is nonzero),
instead of solving (17) one can solve:

\[ \delta_i \|p - s_i\| - v^T (p - s_i) = 0, \quad i = 1, \ldots, n, \tag{18} \]

or

\[ \delta_i \sqrt{(p - s_i)^T (p - s_i)} - v^T (p - s_i) = 0, \quad i = 1, \ldots, n. \tag{19} \]

Having the square-root in (19) makes it undesirable for solving
numerically. Hence, instead we consider the following set
of equations:

\[ \delta_i^2 \, (p - s_i)^T (p - s_i) - \left( v^T (p - s_i) \right)^2 = 0, \quad i = 1, \ldots, n. \tag{20} \]

Note that any solution of (19) is a solution of (20) but not
vice versa. Assume x* = [p*^T v*^T]^T is a solution to (19);
then both x_1* = [p*^T v*^T]^T and x_2* = [p*^T −v*^T]^T
satisfy (20). From Proposition 3 we know that for n ≥ 5
(19) has a unique solution. Then it easily follows that for
n ≥ 5, (20) has exactly two solutions. Replacing p by p* in
(17) results in a set of linear equations in v which only one
of the values v* or −v* satisfies. The aforementioned
analysis of the solutions for x in the noiseless case forms the
basis of the algorithm proposed in this section.

Now we consider the case where the doppler shift measurement
is corrupted by noise. That is, each sensor measures
f̂_i = f_i + w_i where w_i corresponds to the noise in the
measurement carried out by sensor i. Setting δ̂_i = c f̂_i / (2 f_c), we
have

\[ \hat{\delta}_i = \frac{v^T (p - s_i)}{\|p - s_i\|}. \tag{21} \]

Note that δ̂_i = δ_i + ω_i, where ω_i = c w_i / (2 f_c). In the noisy case
neither (19) nor (20), with δ_i replaced by δ̂_i, generally has a solution for x.
Instead we propose
solving the following minimization problem to calculate the
position and the velocity of the target. Solutions of similar
minimization problems when range, range-difference, and
bearing measurements are available are studied in [13], [14].

\[ [p^*, v^*] = \arg\min_{p, v} F(p, v), \tag{22} \]

where

\[ F(p, v) = \sum_{i=1}^{n} \left( \hat{\delta}_i^2 \, \|p - s_i\|^2 - \left( v^T (p - s_i) \right)^2 \right)^2. \]

The advantage of having such a cost function is that it is
a polynomial in the unknowns, and can be minimized using
modern polynomial optimization methods, e.g. see [15], [16].
By solving this minimization problem we obtain two values
for the position and the velocity of the target, viz. (p_1*, p_2*)
and (v_1*, v_2*), where p_1* = p_2* and v_1* = −v_2*. To find the
correct value for the velocity, as before we replace p by
p* = p_1* = p_2* in (21) to obtain a set of linear equations in
v:

\[ \hat{\delta}_i = \frac{v^T (p^* - s_i)}{\|p^* - s_i\|}. \tag{23} \]

It is easy to check that the linear system of equations (23)
is over-determined and, in the presence of noise, generally
inconsistent, so it cannot be solved exactly.
However, a least-squares solution to this system of linear
equations can be obtained. The least-squares solution to this
system of equations is the estimated value for the target
velocity, v*. This procedure is outlined in Algorithm 1.


Fig.  2.  The  setting  considered  in  the  illustrative  example.  The  squares 
denote  the  position  of  the  sensors  and  the  diamond  is  the  target. 


Algorithm 1 Target Position and Velocity Estimation Using
n Doppler-shift Measurements Collected at Nodes
(1, ..., n) in a Sensor Network via Polynomial Optimization.
Input: δ̂_1, ..., δ̂_n

Output: p*, v*

Require: At least 5 sensors at generic positions.

F(p, v) ← Σ_{i=1}^{n} (δ̂_i^2 ||p − s_i||^2 − (v^T(p − s_i))^2)^2

[p*, v*] ← argmin_{p,v} F(p, v)

for i = 1 : n do

    ρ_i ← (p* − s_i) / ||p* − s_i||

end for

A ← [ρ_1, ..., ρ_n]^T

b ← [δ̂_1, ..., δ̂_n]^T

v* ← A†b {A† is the pseudo-inverse of A.}

return p* and v*
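
Two ingredients of Algorithm 1 are easy to sketch: the polynomial cost F(p, v) and the final least-squares step v* = A†b. The code below is a hedged illustration only; the minimization of F itself is left to a polynomial optimization solver as described in the text and is not implemented here. The function names and the scenario are hypothetical, and the pseudo-inverse is computed through the 2 x 2 normal equations, which assumes that A has full column rank.

```python
import math


def cost(p, v, sensors, deltas):
    """F(p, v) = sum_i (delta_i^2 ||p - s_i||^2 - (v^T (p - s_i))^2)^2."""
    total = 0.0
    for s, d in zip(sensors, deltas):
        dx, dy = p[0] - s[0], p[1] - s[1]
        e = d * d * (dx * dx + dy * dy) - (v[0] * dx + v[1] * dy) ** 2
        total += e * e
    return total


def velocity_least_squares(p_star, sensors, deltas):
    """Least-squares solution of rho_i^T v = delta_i, i.e. v = pinv(A) b."""
    rows = []
    for s in sensors:
        dx, dy = p_star[0] - s[0], p_star[1] - s[1]
        r = math.hypot(dx, dy)
        rows.append((dx / r, dy / r))          # rho_i, the unit vector from s_i to p*
    a11 = sum(u[0] * u[0] for u in rows)
    a12 = sum(u[0] * u[1] for u in rows)
    a22 = sum(u[1] * u[1] for u in rows)
    b1 = sum(u[0] * d for u, d in zip(rows, deltas))
    b2 = sum(u[1] * d for u, d in zip(rows, deltas))
    det = a11 * a22 - a12 * a12                # nonzero when A has full column rank
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Hypothetical noiseless scenario with five sensors.
sensors = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0), (6.0, 3.0), (-2.0, 4.0)]
p_true, v_true = (1.0, 2.0), (1.0, 0.5)
deltas = [((p_true[0] - s[0]) * v_true[0] + (p_true[1] - s[1]) * v_true[1])
          / math.hypot(p_true[0] - s[0], p_true[1] - s[1]) for s in sensors]

print(cost(p_true, v_true, sensors, deltas))
print(cost(p_true, (-v_true[0], -v_true[1]), sensors, deltas))
print(velocity_least_squares(p_true, sensors, deltas))
```

The cost vanishes at both (p, v) and (p, −v), which is the sign ambiguity noted earlier; the linear least-squares step is what selects the correct sign of the velocity.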


Fig. 3. Position estimate error: the error in the estimate of the position of
the target after repeating the scenario 20 times with variable noise variances
when six and seven sensors are used.


V.  Illustrative  Example 

In this section we demonstrate the performance of the algorithm
introduced in Section IV to estimate the position and
the velocity of the target. We consider the setting depicted
in Fig. 2. We consider the case where the measurements
are corrupted by ten different levels of noise, where the
noise is assumed to be Gaussian with zero mean and variable
variance (and independent at each sensor). The errors in the
estimates of the position and the velocity for different noise
levels, after repeating the scenario twenty times with
six and seven measurements, are depicted in Fig.
3 and Fig. 4 respectively. Moreover, note that the setting
presented in Fig. 2 shows elements of a bad geometry as
well: sensors at positions s_1, s_2, and s_3 and the target
are nearly collinear. However, here due to the presence of
other sensors at non-collinear positions the estimation can
be carried out effectively.


Fig. 4. Velocity estimate error: the error in the estimate of the velocity of
the target after repeating the scenario 20 times with variable noise variances
when six and seven sensors are used.


VI.  Concluding  Remarks  and  Future  Directions 
The minimum number of doppler shift measurements necessary to have a finite number of solutions for the unknown target position and velocity is calculated analytically via algebraic arguments. Additionally, we stated the necessary and sufficient number of generic measurements to have a unique solution for the target parameters. The same problem has then been studied for the case where, in addition to doppler shift measurements, other types of measurements are available, e.g. bearing or distance to the target from each of the sensors. Finally, a method based on polynomial optimization is introduced to calculate an estimate for the position and the velocity of the target using noisy doppler shift measurements. A numerical example is presented to demonstrate the performance of this algorithm.

A possible future research direction is to design a dynamical estimator that estimates the position and the velocity of the target by measuring the doppler shift continuously. Alternatively, one could consider the notion of constraint-based optimization for localization as discussed in, e.g., [17]-[22].

Abstract

This  paper  outlines  the  problem  of  doppler-based  target  position  and  velocity  estimation  using  a  sensor  network.  The  minimum 
number  of  doppler  shift  measurements  at  distinct  generic  sensor  positions  to  have  a  finite  number  of  solutions,  and  later,  a  unique 
solution  for  the  unknown  target  position  and  velocity  is  stated  analytically.  Furthermore,  we  study  the  same  problem  where  not 
only  doppler  shift  measurements  are  collected,  but  also  other  types  of  measurements  are  available,  e.g.  bearing  or  distance  to  the 
target  from  each  of  the  sensors.  Later,  we  study  the  Cramer-Rao  inequality  associated  with  the  doppler-shift  measurements  to  a 
target  in  a  sensor  network,  and  use  the  Cramer-Rao  bound  to  illustrate  some  results  on  optimal  placements  of  the  sensors  when 
the  goal  is  to  estimate  the  velocity  of  the  target.  Some  simulation  results  are  presented  in  the  end. 

Index  Terms 

Doppler  Measurements,  Location  Estimation,  Doppler  Localization,  Cramer-Rao  Inequality,  Fisher  Information  Matrix 


I.  Introduction 


Using doppler-shifts for position and velocity estimation has a long history; see e.g. [1]-[8]. Recently, the doppler effect has gained renewed interest and has been implemented for cooperative positioning in vehicular networks [9].

In this paper, we consider a scenario with $n$ nodes with both transmitting and sensing capabilities, which are called sensors for the rest of this paper. The target has an unknown position and velocity $x = [p^\top\ v^\top]^\top \in \mathbb{R}^4$. The positions of the non-collocated sensors are given by $s_i = [s_{i,1}\ s_{i,2}]^\top \in \mathbb{R}^2$, $\forall i \in \{1, \ldots, n\}$.

The measured doppler-shift is $\hat{\delta}_i$ at the $i$th sensor and is caused by a target reflection due to a signal generated earlier by the same sensor. This frequency shift can be approximated by [10]

\[ \hat{\delta}_i = \frac{2 f_{c,i}}{c} \left( \frac{(p - s_i)^\top}{\|p - s_i\|} \right) v + w_i \tag{1a} \]
\[ \phantom{\hat{\delta}_i} = \delta_i + w_i \tag{1b} \]

where $c$ is the speed of light (or signal propagation), $\|\cdot\|$ is the standard Euclidean vector norm, and $f_{c,i}$ is the carrier frequency employed by this sensor. Finally, $w_i$ is the noise variable. Note here that the localization is to be achieved instantaneously;


I.  Shames  is  with  the  ACCESS  Linnaeus  Centre,  Royal  Institute  of  Technology  (KTH)  in  Stockholm,  Sweden,  e-mail:  imansh@kth.se. 
A.N.  Bishop  and  B.D.O.  Anderson  are  with  NICTA,  Canberra  Research  Lab  and  the  Australian  National  University  (ANU),  e-mail: 
{Adrian.Bishop, Brian.Anderson}@anu.edu.au. M. Smith is with CEA Technologies, Canberra, Australia, e-mail: Matthew.Smith@cea.com.au. I. Shames
is supported by the Swedish Research Council and the Knut and Alice Wallenberg Foundation. A.N. Bishop and B.D.O. Anderson are supported by USAF-AOARD-10-4102
and by NICTA which is funded by the Australian Government as represented by the Department of Broadband, Communications and the
Digital  Economy  and  the  Australian  Research  Council  through  the  ICT  Centre  of  Excellence  program.  A.N.  Bishop  is  also  supported  by  the  Australian 
Research  Council  (ARC)  via  a  Discovery  Early  Career  Researcher  Award  (DE-120102873).  M.  Smith  is  supported  by  CEA  Technologies. 


SUBMITTED  TO  IEEE  TRANSACTIONS  ON  AEROSPACE  AND  ELECTRONIC  SYSTEMS,  DRAFT  MARCH  26,  2012 




we  are  not  envisaging  collecting  information  from  agents  at  a  number  of  successive  instants  of  time  and  using  them  to  infer 
position  at  a  single  instant  of  time  (The  connection  with  filtering  methods  is  explored  further  below).  There  are  many  studies 
in  the  literature  that  try  to  solve  a  similar  problem  via  collecting  measurements  over  a  time  interval  and  feeding  them  into  an 
estimator.  For  example,  in  [5]  the  problem  of  localization  of  a  single  aircraft  using  doppler  measurements  is  studied.  Other 
similar approaches can be found in [4], [6]-[8]. The analysis carried out in this paper along with the optimization method
proposed  can  be  considered  as  constituting  a  batch  processing  method  for  instantaneous  estimation  of  the  target  location  and 
velocity.  This  in  turn  might  be  used  to  initialize  and  improve  the  updates  of  any  other  implemented  filter  which  tracks  the  target 
position  as  the  target  moves  in  the  environment.  For  example,  Kalman-based  filters  are  prone  to  errors  when  the  target’s  motion 
model  deviates  significantly  from  the  actual  target  motion.  Our  analysis  can  guard  against  such  behavior  and  re-initialize  the 
filter,  e.g.  see  [11]  and  references  therein. 

The  first  main  contribution  of  this  paper  is  that  the  minimum  number  of  doppler  shift  measurements  required  to  have  a  finite 
number  of  solutions  for  the  unknown  x  is  algebraically  derived.  In  some  scenarios,  a  separate  piece  of  knowledge  about  the 
target  will  allow  disambiguation.  Later,  the  result  is  extended  to  the  case  where  having  a  unique  solution  is  required.  Moreover, 
the  scenarios  where  different  types  of  measurements,  e.g.  direction-of-arrival,  or  distance,  are  available  in  addition  to  doppler 
shift  measurements  are  considered.  The  aforementioned  conclusions  assume  zero  measurement  noise;  following  on  from  this, 
an  optimization  method  based  on  polynomial  optimization  methods  is  introduced  to  calculate  the  velocity  and  the  position  of  a 
target  where  noisy  doppler  shift  measurements  are  available.  Later,  we  calculate  the  Cramer-Rao  inequality  for  different  sensor 
network  configurations  and  present  some  discussions  of  the  Fisher  Information  matrix,  which  in  turn  leads  to  the  introduction 
of  optimal  sensor  placement  in  the  scenarios  where  a  network  of  sensors  gather  doppler-shift  measurements  in  the  signal  from 
a  target. 

The remaining sections of this paper are organized as follows. In the next section the main problem of interest is considered.
In  Section  III  the  case  where  different  types  of  measurement  in  addition  to  doppler  shift  measurement  are  available  to  the 
sensors  is  considered.  A  method  based  on  polynomial  optimization  to  estimate  the  position  and  the  velocity  of  the  target  where 
doppler  shift  measurements  are  contaminated  by  noise  is  presented  in  Section  IV.  The  Cramer-Rao  inequality  for  the  scenario 
considered  here  is  calculated  in  Section  V.  In  Section  VI  some  insights  into  the  Fisher  Information  matrix  for  the  velocity 
estimation  are  presented.  An  illustrative  example  presenting  the  performance  of  the  proposed  optimization  method  is  given  in 
Section  VII.  Concluding  remarks  and  future  directions  come  in  Section  VIII. 

II.  Required  Minimum  Number  of  Doppler  Shift  Measurements 

Initially, it is assumed that the measurements are noiseless, i.e. $w_i = 0$, so that

\[ \hat{\delta}_i = \delta_i = \frac{v^\top (p - s_i)}{\|p - s_i\|} \, \frac{2 f_{c,i}}{c}. \tag{2} \]

Normalizing so that $f_i = \dfrac{c\,\delta_i}{2 f_{c,i}}$, we obtain

\[ f_i = \frac{v^\top (p - s_i)}{\|p - s_i\|}. \tag{3} \]

Now we are ready to pose the problem of interest in this section.


Problem 1. Consider $n$ stationary sensors at $s_i \in \mathbb{R}^2$ capable of collecting noiseless doppler shift measurements of the form (2) from a target at position $p$ moving with a nonzero velocity $v$.


1)  What  is  the  minimum  value  for  n  such  that  there  is  a  finite  number  of  solutions  for  x? 

2)  What  is  the  minimum  value  for  n  such  that  there  is  a  unique  solution  for  x? 


We limit our analysis to the case where the nodes and the target are in $\mathbb{R}^2$; however, the analysis for the case where they are in $\mathbb{R}^3$ is much the same. First we have the following remark.

Remark 1. Throughout this paper, when it is stated that a property holds for generic positions of the sensors, it is meant that the property holds for all positions of the sensors except for sensor positions in a set of measure zero.


The answer to the first question posed in Problem 1 is formally presented in the following proposition for the case where $n = 4$ doppler shift measurements are available. This proposition states that with $n = 4$ measurements generically¹ we have a finite number of solutions, and one might be able to disambiguate the solutions using other measurements, e.g. range or bearing measurements. Moreover, disambiguation may also be made possible by prior measurements, or by a priori knowledge about the geographic constraints on targets in the area of interest. This is particularly important for the cases where the possible solutions are widely separated, e.g., see Fig. 1. Hence this situation is of potential practical interest.

Proposition  1.  For  n  =  4  doppler  measurements  as  described  by  (2)  and  generic  positions  of  the  sensors  there  is  a  finite 
number  of  solutions  for  the  unknown  x. 


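Proof: Denote the noiseless mapping from the agent position and velocity, $x = [p^\top\ v^\top]^\top$ (a vector in $\mathbb{R}^4$), to the measurements (another vector in $\mathbb{R}^n$) by $F$, where $n$ is the number of sensors. More specifically, $F(x) = [f_1\ \cdots\ f_n]^\top$. Let $\nabla F$ denote the Jacobian of $F$:

\[ \nabla F = \begin{bmatrix} \dfrac{\partial f_1}{\partial p_1} & \dfrac{\partial f_1}{\partial p_2} & \dfrac{\partial f_1}{\partial v_1} & \dfrac{\partial f_1}{\partial v_2} \\ \vdots & \vdots & \vdots & \vdots \\ \dfrac{\partial f_4}{\partial p_1} & \dfrac{\partial f_4}{\partial p_2} & \dfrac{\partial f_4}{\partial v_1} & \dfrac{\partial f_4}{\partial v_2} \end{bmatrix} \tag{4} \]

where

\[ \frac{\partial f_i}{\partial p_1} = \frac{-(p_2 - s_{i,2})(s_{i,2} v_1 - s_{i,1} v_2 + p_1 v_2 - p_2 v_1)}{\|s_i - p\|^3} \tag{5} \]

\[ \frac{\partial f_i}{\partial p_2} = \frac{(p_1 - s_{i,1})(s_{i,2} v_1 - s_{i,1} v_2 + p_1 v_2 - p_2 v_1)}{\|s_i - p\|^3} \tag{6} \]

\[ \frac{\partial f_i}{\partial v_1} = \frac{p_1 - s_{i,1}}{\|s_i - p\|} \tag{7} \]

\[ \frac{\partial f_i}{\partial v_2} = \frac{p_2 - s_{i,2}}{\|s_i - p\|} \tag{8} \]

¹There will be special positions of sensors for which $x$ cannot be determined, for example if they are all collocated. However, the set of such exceptional positions is a set of measure zero; the word 'generically' captures this notion.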

It is easy to check that for generic values of $x$ and $s_i$, $i = 1, \ldots, 4$, $\nabla F$ is nonsingular. Moreover, we know that the set of solutions to (2) forms an algebraic variety [12], i.e. the solution set can be defined by solving a set of multivariable polynomial equations. The reason for this is that while (2) is not a polynomial equation, it is easy to see that its zeros are also zeros of the following polynomial equation:

\[ f_i^2 \|p - s_i\|^2 - (v^\top (p - s_i))^2 = 0. \tag{9} \]

The set of solutions to the equations described by (9) is known to have at least one member, namely the solution corresponding to the physical setup. The nonsingularity of the Jacobian implies that for generic values of the measurements we do not have a continuous set of solutions. As a result, the variety is a zero-dimensional variety [12]. Further, any zero-dimensional variety has a finite number of points [12], and so the solution set, as a subset of a zero-dimensional variety, also has a finite number of points; i.e., there is a finite number of solutions for the localization problem using doppler measurements where $n \ge 4$. ■

Now we briefly consider the case where the Jacobian matrix $\nabla F$ is singular. This corresponds to those sensor and target geometries where there is an infinite number of solutions for the unknown $p$ and $v$. We call these degenerate geometries. The following proposition characterizes two of these degenerate geometries.

Proposition 2. The Jacobian matrix $\nabla F$ given in (4) is singular in the following situations:

1) If any pair of the sensors $s_1, \ldots, s_4$ and the target $p$ are collinear.

2) If $v = 0$.

Proof: To prove the first statement of this proposition it is enough to evaluate (4) for the case where two of the sensors $s_1, \ldots, s_4$ and $p$ lie on the same line. The calculations are trivial and are omitted for brevity. To establish the validity of the second statement it is enough to observe that the first two columns of (4) are zero when $v = 0$. ■
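The partial derivatives entering the Jacobian (4) can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it compares the closed-form derivatives of $f_i$ with central finite differences, evaluates the determinant for one sample geometry, and confirms the second degenerate case ($v = 0$). The sensor layout, the target state, and all function names are hypothetical choices made for illustration.

```python
import math

def doppler(x, s):
    """Normalized noiseless measurement f_i of (3) for x = (p1, p2, v1, v2)."""
    p1, p2, v1, v2 = x
    d1, d2 = p1 - s[0], p2 - s[1]
    return (v1 * d1 + v2 * d2) / math.hypot(d1, d2)

def jacobian_row(x, s):
    """Closed-form partial derivatives of f_i w.r.t. (p1, p2, v1, v2)."""
    p1, p2, v1, v2 = x
    d1, d2 = p1 - s[0], p2 - s[1]
    r = math.hypot(d1, d2)
    k = s[1] * v1 - s[0] * v2 + p1 * v2 - p2 * v1  # factor shared by both position derivatives
    return [-d2 * k / r**3, d1 * k / r**3, d1 / r, d2 / r]

def finite_difference_row(x, s, h=1e-6):
    """Central-difference approximation of the same row."""
    row = []
    for j in range(4):
        hi, lo = list(x), list(x)
        hi[j] += h
        lo[j] -= h
        row.append((doppler(hi, s) - doppler(lo, s)) / (2 * h))
    return row

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) < 1e-15:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result

# Hypothetical geometry: four sensors and a moving target.
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 9.0)]
X = (3.0, 4.0, 1.0, -0.5)

max_fd_error = max(
    abs(a - b)
    for s in SENSORS
    for a, b in zip(jacobian_row(X, s), finite_difference_row(X, s))
)
det_generic = det([jacobian_row(X, s) for s in SENSORS])
det_zero_velocity = det([jacobian_row((3.0, 4.0, 0.0, 0.0), s) for s in SENSORS])
```

For this sample geometry the determinant is nonzero, while setting $v = 0$ zeroes the first two columns and hence the determinant. A single numerical example of course only illustrates, and does not prove, the generic statement.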

It  is  worthwhile  to  note  that  in  practice  any  geometry  in  which  the  sensors  and  the  target  are  almost  collinear  will  also  be 
problematic  due  to  the  presence  of  noise  in  the  measurements. 

After establishing that there is a finite number of solutions for generic positions of the sensors where four doppler shift measurements are available, we present an answer to the second question posed in Problem 1. Before we formally propose the answer, note that the number of unknowns is four, and that with four pieces of data we have four polynomial equations, which in general have a finite set of multiple solutions (though they may not all be real). However, with five pieces of data, one expects that the associated equations generically have a unique solution. This solution is the one solution common to two selections of four. In the next proposition we prove that the position and the velocity of the target can be uniquely calculated if there are five doppler measurements available.

Proposition 3. For $n \ge 5$ doppler measurements as described by (2) and generic positions for the sensors there is a unique solution for the unknown $x$.

Insert Figure 1 here

Fig. 1. Placing the fifth sensor at any of the positions indicated by * results in an ambiguous solution for the position of the target.


Proof: From Proposition 1 we know that for $n = 4$ there is a finite number of solutions for $x$, $m$ say. Call the solutions for the position and the velocity of the target $x_1, \ldots, x_m$. Now, temporarily regard the position of sensor 5, $s_5 = [s_{5,1}\ s_{5,2}]^\top$, as an unknown. Consider the relationship between each of these solutions, $x_i = [p_{i,1}\ p_{i,2}\ v_{i,1}\ v_{i,2}]^\top$, and the position of sensor 5, $s_5$:

\[ f_5^2 \left( (p_{i,1} - s_{5,1})^2 + (p_{i,2} - s_{5,2})^2 \right) - \left( v_{i,1}(p_{i,1} - s_{5,1}) + v_{i,2}(p_{i,2} - s_{5,2}) \right)^2 = 0. \tag{10} \]

This equation (regarded as an equation for the two temporarily unknown coordinates $s_{5,1}$ and $s_{5,2}$) results in two straight lines intersecting at $p_i$; the true $s_5$ must lie on one of these straight line pairs, namely that associated with the correct target position. We claim that generically only one such equation can be satisfied by the true $s_5$. To establish the claim, we argue by contradiction. Assume that $x_j$ and $x_k$ define two different loci on both of which $s_5$ lies. These loci intersect in at most four points. So for all positions of $s_5$ except these four points, $x_j$ and $x_k$ cannot simultaneously be solutions to the localization problem via doppler measurements. With similar arguments one can eliminate any multiplicity of solutions for generic values of doppler shift measurements for $n \ge 5$. ■
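The claim that (10) describes a pair of straight lines through $p_i$ can be illustrated numerically. Writing $s_5 = p_i + t\,d$ with a unit direction $d$, (10) reduces to $(v_i^\top d)^2 = f_5^2$, so $d$ makes an angle $\pm\arccos(f_5 / \|v_i\|)$ with $v_i$. The sketch below, under the assumption $|f_5| \le \|v_i\|$ and with hypothetical numerical values and function names, computes the two directions and evaluates the left-hand side of (10) on and off the lines.

```python
import math

def locus_directions(v, f5):
    """Unit directions of the two lines through p_i on which s_5 must lie.

    Assumes |f5| <= ||v||; f5 = 0 collapses the pair to a single line.
    """
    speed = math.hypot(v[0], v[1])
    theta_v = math.atan2(v[1], v[0])
    alpha = math.acos(max(-1.0, min(1.0, f5 / speed)))
    return [(math.cos(theta_v + a), math.sin(theta_v + a)) for a in (alpha, -alpha)]

def residual_10(p, v, f5, s5):
    """Left-hand side of (10) for a candidate solution (p, v) and sensor position s5."""
    d1, d2 = p[0] - s5[0], p[1] - s5[1]
    return f5**2 * (d1**2 + d2**2) - (v[0] * d1 + v[1] * d2) ** 2

# Hypothetical candidate solution and fifth measurement.
P = (3.0, 4.0)
V = (1.0, -0.5)
F5 = 0.4

DIRS = locus_directions(V, F5)

# Points on both lines (either side of p_i) satisfy (10) up to rounding.
max_on_lines = max(
    abs(residual_10(P, V, F5, (P[0] + t * d[0], P[1] + t * d[1])))
    for d in DIRS
    for t in (-5.0, -1.0, 2.0, 7.0)
)

# A point that is not on either line violates (10).
off_line = residual_10(P, V, F5, (P[0] + 1.0, P[1]))
```

Note that the squared equation (10) admits the full lines, whereas the unsquared measurement equation only admits one half of each line; this is the same sign effect that appears when (22) is replaced by (23).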


The  proof  of  Proposition  3  indicates  that,  given  four  sensors  in  generic  positions  there  are  isolated  positions  where  placing 
a  fifth  sensor  does  not  lead  to  a  unique  solution  for  x.  An  example  of  this  scenario  is  depicted  in  Fig.  1,  where  two  possible 
straight  line  pairs  are  shown,  being  determined  by  two  of  the  possible  finite  target  positions  computed  using  the  measurements 
from  sensors  1  to  4.  Moreover,  four  positions  are  identified  such  that  the  placement  of  a  fifth  sensor  at  any  of  these  positions 
will  not  resolve  the  ambiguity  in  the  target  position. 


So far, we have considered the case where each sensor $i$ emits a signal with a known carrier frequency $f_{c,i}$. In what comes next we consider two other cases. First, we consider a scenario that occurs commonly. In this scenario an illuminating emitter and a receiver are colocated and all the other sensors only receive the reflection of the signal emitted by the first sensor off the target. Suppose the illuminating radar and the first receiver are colocated at $s_1$ and the remaining sensors are located at $s_2, \ldots, s_\ell$. Let $f_c$ be the frequency of the illuminating radar. Then for $i = 1$ there holds

\[ \delta_1 = 2 \frac{f_c}{c} \frac{v^\top (p - s_1)}{\|p - s_1\|} \tag{11} \]

and for $i = 2, \ldots, \ell$, [10]

\[ \delta_i = \frac{f_c}{c} \left[ \frac{v^\top (p - s_1)}{\|p - s_1\|} + \frac{v^\top (p - s_i)}{\|p - s_i\|} \right]. \tag{12} \]

We note that from this data, it is trivial to compute the set of values

\[ \bar{f}_i = \frac{v^\top (p - s_i)}{\|p - s_i\|} \]

for $i = 1, 2, \ldots, \ell$, where

\[ \bar{f}_i \triangleq \frac{c\,(2\delta_i - \delta_1)}{2 f_c}, \quad i = 1, 2, \ldots, \ell. \]

In other words, from a purely mathematical point of view, this scenario can be reduced to the original one, where all the $f_{c,i}$ assume the same value. Hence, the same analysis applies to this scenario.
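The reduction from the single-illuminator measurements (11) and (12) to the normalized values $v^\top (p - s_i)/\|p - s_i\|$ can be sketched as follows. This is an illustrative example only; the geometry, the carrier frequency, and the function names are hypothetical.

```python
import math

C = 3.0e8  # propagation speed in m/s

def radial_speed(p, v, s):
    """v^T (p - s) / ||p - s|| for one sensor position s."""
    d1, d2 = p[0] - s[0], p[1] - s[1]
    return (v[0] * d1 + v[1] * d2) / math.hypot(d1, d2)

def simulate_shifts(p, v, sensors, fc):
    """Noiseless shifts: sensors[0] illuminates and receives (11), the rest only receive (12)."""
    a = [radial_speed(p, v, s) for s in sensors]
    shifts = [2.0 * fc / C * a[0]]
    shifts += [fc / C * (a[0] + ai) for ai in a[1:]]
    return shifts

def reduce_to_common_form(shifts, fc):
    """Map the measured shifts to c (2 delta_i - delta_1) / (2 f_c) for every sensor."""
    return [C * (2.0 * d - shifts[0]) / (2.0 * fc) for d in shifts]

# Hypothetical scenario.
P = (3.0, 4.0)
V = (30.0, -15.0)
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 9.0), (-4.0, 5.0)]
FC = 1.0e9

shifts = simulate_shifts(P, V, SENSORS, FC)
recovered = reduce_to_common_form(shifts, FC)
expected = [radial_speed(P, V, s) for s in SENSORS]
max_gap = max(abs(r - e) for r, e in zip(recovered, expected))
```

After this preprocessing step the recovered values have exactly the form (3), so any solver written for the original scenario can be reused unchanged.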

Second, we consider the case where the target emits a signal with the frequency $f_c$ and the sensors just measure this frequency without emitting any signal of their own. This scenario is usually known as a passive doppler localization scenario. In this case the noiseless measurements are of the form

\[ \delta_i = \frac{f_c\, v^\top (p - s_i)}{c\, \|p - s_i\|}. \tag{13} \]

The theory applying when $f_c$ is independently known a priori is effectively the same as the one stated previously. However, if $f_c$ is unknown, the question arises as to whether it can be determined. For this scenario, we show that there is no way to separate $f_c$ and $v$ from a single set of instantaneous doppler-only measurements. We conclude this section by formalizing this fact and presenting a result on the case where the carrier frequency is unknown.


Proposition 4. For $n \ge 5$ doppler measurements as described by (13) and generic positions for the sensors, and unknown carrier frequency $f_c$, there is a unique solution for the unknown $p$ and for the vector $f_c v$.

Proof: Define $\nu = [\nu_1\ \nu_2]^\top = f_c v$. For (13) we have

\[ \delta_i = \frac{\nu^\top (p - s_i)}{c\, \|p - s_i\|}. \tag{14} \]

From Proposition 3 we know that using five equations of the form (14), $p$ and $\nu$ can be calculated uniquely. It follows easily that if one knows neither $f_c$ nor $v$ and only estimates $\nu = f_c v$, then for any chosen $f_c$ (or any chosen $v$ parallel to $\nu$) there is a corresponding value of $v$ (or $f_c$) that satisfies $\nu = f_c v$. Thus, $f_c$ and $v$ cannot be calculated separately. ■
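The scale ambiguity in the passive case is easy to reproduce numerically: scaling the carrier frequency by a factor and the velocity by its inverse leaves every measurement (13) unchanged. The following sketch uses hypothetical values and function names.

```python
import math

C = 3.0e8  # propagation speed in m/s

def passive_shifts(p, v, sensors, fc):
    """Noiseless passive measurements of the form (13)."""
    out = []
    for s in sensors:
        d1, d2 = p[0] - s[0], p[1] - s[1]
        out.append(fc * (v[0] * d1 + v[1] * d2) / (C * math.hypot(d1, d2)))
    return out

# Hypothetical scenario with five sensors.
P = (3.0, 4.0)
V = (30.0, -15.0)
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 9.0), (-4.0, 5.0)]

SCALE = 3.0
shifts_a = passive_shifts(P, V, SENSORS, 1.0e9)
shifts_b = passive_shifts(P, (V[0] / SCALE, V[1] / SCALE), SENSORS, SCALE * 1.0e9)

# Both (f_c, v) pairs explain the data equally well; only the product f_c * v is identifiable.
indistinguishable = all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12) for a, b in zip(shifts_a, shifts_b)
)
```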


Remark 2. By using the calculated value of $p$ in consecutive time steps, one can estimate the velocity of the agent. Using this estimated velocity and knowing $\nu$, one can further estimate $f_c$.


III.  Required  Minimum  Number  of  Hybrid  Measurements 

In this section we study the effect of having other types of measurements, in addition to doppler shift measurements, on calculating the velocity and the position of the target.

Before continuing further, we note that to have a unique solution for a set of polynomial equations there is usually a need to have more equations than unknowns, save in cases where the equations are linear. However, having more equations than unknowns does not of itself guarantee the existence of a unique solution; the extra equation must in some way be independent, and it may have this property in almost all circumstances, i.e. generically, even if not always. In this section we establish the cases where it can be mathematically shown that a unique solution for $x$ exists when a combination of doppler and other measurements is available. We have the following results:

Proposition  5.  The  following  statements  are  true. 

1) First, consider the case where in addition to the doppler shift measurements described earlier, the distances between each of the sensors and the target can be measured as well. Denote the distance between sensor $i$ and the target as $d_i$, where

\[ d_i = \|p - s_i\|, \quad i = 1, \ldots, m_d. \tag{15} \]

a) For $n \ge 2$ doppler measurements as described by (2), $m_d \ge 3$ distance measurements (15), and generic positions of the sensors there is a unique solution for the unknown $x$.

b) For $n \ge 3$ doppler measurements as described by (2), $m_d \ge 2$ distance measurements (15), and generic positions of the sensors there is a unique solution for the unknown $x$.

2) Second, consider the case that instead of distance measurements, bearing measurements to the target at each sensor $i$ are available, viz.

\[ \psi_i = [\psi_{i,1}\ \psi_{i,2}]^\top = \frac{p - s_i}{\|p - s_i\|}, \quad i = 1, \ldots, m_b, \tag{16} \]

is known at each sensor $i$.

a) For $n \ge 2$ doppler measurements as described by (2), $m_b \ge 2$ bearing measurements (16), and generic positions of the sensors there is a unique solution for the unknown $x$.

b) For $n \ge 4$ doppler measurements as described by (2), only $m_b = 1$ measured bearing to the target, and generic positions of the sensors there is a unique solution for the unknown $x$.

3) Third, for $n \ge 3$ doppler measurements as described by (2), at least $m_d = 1$ distance measurement (15), at least $m_b = 1$ measured bearing to the target (16), and generic positions of the sensors there is a unique solution for the unknown $x$.

Proof: The proof of 1a is trivial.

To prove 1b observe that the two distance measurements pin down the target to a binary ambiguity; i.e. two generic circles have two points of intersection. Using each of these positions, from the doppler equations we then obtain two sets of three linear equations in the velocity of the target. Generically, only one of these sets forms a consistent system of linear equations.

The proof of 2a is obvious.

To prove 2b, without loss of generality assume that only sensor 1 can collect a bearing measurement to the target in addition to the doppler shift measurement. The rest of the sensors can only measure the doppler shift. For the doppler measurement at sensor 1 we have

\[ \delta_1 = 2 v^\top \psi_1 \frac{f_{c,1}}{c} = 2 (v_1 \psi_{1,1} + v_2 \psi_{1,2}) \frac{f_{c,1}}{c}, \tag{17} \]

that is a linear equation in $v$. Moreover, we have

\[ p_2 = s_{1,2} + \psi_{1,2} \frac{p_1 - s_{1,1}}{\psi_{1,1}}. \tag{18} \]

Calculating $v_2$ in terms of $v_1$ from (17) and $p_2$ in terms of $p_1$ from (18) and replacing them in

\[ \delta_i = 2 v^\top \frac{p - s_i}{\|p - s_i\|} \frac{f_{c,i}}{c}, \quad i = 2, 3, 4, \tag{19} \]

we obtain three quadratic equations in $v_1$ and $p_1$ only. Furthermore, it is known that generically three quadratic equations in two variables have a unique solution. Hence, $x$ can be determined uniquely.

The proof of the last statement is very similar to that of 1b and is omitted. ■

TABLE I

Minimum Number of Measurements in Different Scenarios for Unique Target Position and Velocity Localization

No. of Doppler Measurements     5   2   3   2   4   3
No. of Distance Measurements    0   3   2   0   0   1
No. of Bearing Measurements     0   0   0   2   1   1
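The argument used for case 1b (two distance measurements and three doppler measurements) translates directly into a small procedure: intersect the two range circles, fit the velocity by linear least squares at each candidate position, and keep the candidate whose doppler equations are consistent. The sketch below is illustrative only; it assumes noiseless normalized measurements $f_i$ as in (3), and the geometry and function names are hypothetical.

```python
import math

def circle_intersections(sa, da, sb, db):
    """Intersection points of the circles ||p - sa|| = da and ||p - sb|| = db."""
    ex, ey = sb[0] - sa[0], sb[1] - sa[1]
    base = math.hypot(ex, ey)
    along = (da**2 - db**2 + base**2) / (2.0 * base)
    height = math.sqrt(max(da**2 - along**2, 0.0))
    mx, my = sa[0] + along * ex / base, sa[1] + along * ey / base
    return [(mx - height * ey / base, my + height * ex / base),
            (mx + height * ey / base, my - height * ex / base)]

def unit_vectors(p, sensors):
    """Unit vectors (p - s_i) / ||p - s_i|| for all sensors."""
    out = []
    for s in sensors:
        d1, d2 = p[0] - s[0], p[1] - s[1]
        r = math.hypot(d1, d2)
        out.append((d1 / r, d2 / r))
    return out

def fit_velocity(p, sensors, f):
    """Least-squares v for the linear equations f_i = u_i^T v, plus the squared residual."""
    u = unit_vectors(p, sensors)
    g11 = sum(a * a for a, _ in u)
    g12 = sum(a * b for a, b in u)
    g22 = sum(b * b for _, b in u)
    b1 = sum(fi * a for fi, (a, _) in zip(f, u))
    b2 = sum(fi * b for fi, (_, b) in zip(f, u))
    d = g11 * g22 - g12 * g12
    v = ((g22 * b1 - g12 * b2) / d, (g11 * b2 - g12 * b1) / d)
    res = sum((fi - (a * v[0] + b * v[1])) ** 2 for fi, (a, b) in zip(f, u))
    return v, res

def localize(sensors, f, da, db):
    """Pick the circle intersection whose doppler equations are most consistent."""
    best = None
    for cand in circle_intersections(sensors[0], da, sensors[1], db):
        v, res = fit_velocity(cand, sensors, f)
        if best is None or res < best[2]:
            best = (cand, v, res)
    return best

# Hypothetical noiseless scenario: three sensors, ranges from the first two.
P_TRUE, V_TRUE = (3.0, 4.0), (1.0, -0.5)
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
F = [a * V_TRUE[0] + b * V_TRUE[1] for a, b in unit_vectors(P_TRUE, SENSORS)]
DA = math.hypot(P_TRUE[0] - SENSORS[0][0], P_TRUE[1] - SENSORS[0][1])
DB = math.hypot(P_TRUE[0] - SENSORS[1][0], P_TRUE[1] - SENSORS[1][1])

p_est, v_est, residual = localize(SENSORS, F, DA, DB)
```

In this example the mirror-image candidate satisfies the first two doppler equations (with a mirrored velocity) but not the third, which is why three doppler measurements are used in case 1b.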

Table I summarises the various cases we have considered. In the next section, we introduce an algorithm to estimate the position and the velocity of the target using the doppler-shift measurements collected at each of the sensors, where noise may contaminate the measurements.


IV.  An  Algorithm  to  Estimate  the  Position  and  the  Velocity  of  the  Target 

To calculate the position and the velocity of the target in the noiseless case it is enough to solve the following system of equations:

\[ f_i - \frac{v^\top (p - s_i)}{\|p - s_i\|} = 0, \quad i = 1, \ldots, n, \tag{20} \]

where $f_i = \dfrac{c\,\delta_i}{2 f_{c,i}}$. Equivalently, due to the fact that the target and the sensors are not collocated (i.e. $\|p - s_i\|$ is nonzero), instead of solving (20) one can solve

\[ f_i \|p - s_i\| - v^\top (p - s_i) = 0, \quad i = 1, \ldots, n, \tag{21} \]

or, written out,

\[ f_i \sqrt{(p - s_i)^\top (p - s_i)} - v^\top (p - s_i) = 0, \quad i = 1, \ldots, n. \tag{22} \]

Having the square root in (22) makes it undesirable for solving numerically. Hence, instead we consider the following set of equations:

\[ f_i^2 (p - s_i)^\top (p - s_i) - (v^\top (p - s_i))^2 = 0, \quad i = 1, \ldots, n. \tag{23} \]

Note that any solution of (22) is a solution of (23) but not vice versa. Assume $x^* = [p^{*\top}\ v^{*\top}]^\top$ is a solution to (22); then both $x_1^* = [p^{*\top}\ v^{*\top}]^\top$ and $x_2^* = [p^{*\top}\ -v^{*\top}]^\top$ satisfy (23). From Proposition 3 we know that for $n \ge 5$ (22) has a unique solution. It then easily follows that for $n \ge 5$, (23) has exactly two solutions. Replacing $p$ by $p^*$ in (20) results in a set of linear equations in $v$ which is satisfied by only one of the values $v^*$ or $-v^*$. The aforementioned analysis of the solutions for $x$ in the noiseless case forms the basis of the algorithm proposed in this section.
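The sign ambiguity introduced by squaring can be checked directly: the true state and its velocity-reversed copy both satisfy (23), but only the true state satisfies (20). The sketch below uses a hypothetical geometry and hypothetical function names.

```python
import math

def residuals_20(p, v, sensors, f):
    """Residuals f_i - v^T (p - s_i) / ||p - s_i|| of (20)."""
    out = []
    for s, fi in zip(sensors, f):
        d1, d2 = p[0] - s[0], p[1] - s[1]
        out.append(fi - (v[0] * d1 + v[1] * d2) / math.hypot(d1, d2))
    return out

def residuals_23(p, v, sensors, f):
    """Residuals f_i^2 ||p - s_i||^2 - (v^T (p - s_i))^2 of the polynomial form (23)."""
    out = []
    for s, fi in zip(sensors, f):
        d1, d2 = p[0] - s[0], p[1] - s[1]
        out.append(fi**2 * (d1**2 + d2**2) - (v[0] * d1 + v[1] * d2) ** 2)
    return out

# Hypothetical noiseless scenario with five sensors.
P = (3.0, 4.0)
V = (1.0, -0.5)
V_FLIPPED = (-V[0], -V[1])
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 9.0), (-4.0, 5.0)]
F = [-r for r in residuals_20(P, V, SENSORS, [0.0] * len(SENSORS))]  # f_i from (3)

worst_23_true = max(abs(r) for r in residuals_23(P, V, SENSORS, F))
worst_23_flipped = max(abs(r) for r in residuals_23(P, V_FLIPPED, SENSORS, F))
worst_20_true = max(abs(r) for r in residuals_20(P, V, SENSORS, F))
worst_20_flipped = max(abs(r) for r in residuals_20(P, V_FLIPPED, SENSORS, F))
```

With the reversed velocity each residual of (20) equals $2 f_i$, so the linear system in $v$ obtained from (20) at $p = p^*$ immediately rules out the spurious solution.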


Now we consider the case where the doppler shift measurement is corrupted by noise. That is, each sensor measures $\hat{\delta}_i = \delta_i + w_i$, where $w_i$ corresponds to the noise in the measurement carried out by sensor $i$. Setting $\hat{f}_i = \dfrac{c\,\hat{\delta}_i}{2 f_{c,i}}$, we have

\[ \hat{f}_i = \frac{v^\top (p - s_i)}{\|p - s_i\|} + \omega_i. \tag{24} \]

Note that $\hat{f}_i = f_i + \omega_i$, where $\omega_i = \dfrac{c\, w_i}{2 f_{c,i}}$. In the noisy case neither (22) nor (23) has a solution for $x$ when $f_i$ is replaced by $\hat{f}_i$. Instead we propose solving the following minimization problem to calculate the position and the velocity of the target. Solutions of similar minimization problems when range, range-difference, and bearing measurements are available are studied in [13], [14].

\[ [p^*, v^*] = \arg\min_{p, v} J(p, v), \tag{25} \]

where $J(p, v) = \sum_{i=1}^{n} \left( \hat{f}_i^2 \|p - s_i\|^2 - (v^\top (p - s_i))^2 \right)^2$. The advantage of having such a cost function is that it is a polynomial in the unknowns, and can be minimized using modern polynomial optimization methods, e.g. see [15], [16]. By solving this minimization problem we obtain two values for the position and the velocity of the target, viz. $(p_1^*, p_2^*)$ and $(v_1^*, v_2^*)$, where $p_1^* = p_2^*$ and $v_1^* = -v_2^*$. To find the correct value for the velocity, as before we replace $p$ by $p^* \triangleq p_1^* = p_2^*$ in (24) and drop the unknown noise term to obtain a set of linear equations in $v$:

\[ \hat{f}_i = \frac{v^\top (p^* - s_i)}{\|p^* - s_i\|}, \quad i = 1, \ldots, n. \tag{26} \]

It is easy to check that the linear system of equations (26) is over-determined and, in the presence of noise, generally inconsistent, so it cannot be solved exactly. However, a least-squares solution to this system of linear equations can be obtained. The least-squares solution to this system of equations is the estimated value for the target velocity, $v^*$. This procedure is outlined in Algorithm 1.
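The second stage of the procedure, recovering $v^*$ from the over-determined linear system once a position estimate $p^*$ is available, can be sketched in a few lines. The example below is a simplified illustration and not the authors' implementation: it takes $p^*$ as given (here, the true position stands in for the output of the polynomial minimization), solves the $2 \times 2$ normal equations in closed form, and uses hypothetical sensor positions and a fixed hypothetical perturbation in place of random noise.

```python
import math

def unit_vectors(p, sensors):
    """Rows (p - s_i)^T / ||p - s_i|| of the matrix A."""
    rows = []
    for s in sensors:
        d1, d2 = p[0] - s[0], p[1] - s[1]
        r = math.hypot(d1, d2)
        rows.append((d1 / r, d2 / r))
    return rows

def velocity_least_squares(p_star, sensors, f_hat):
    """Least-squares solution of A v = b, i.e. v = (A^T A)^{-1} A^T b for full-rank A."""
    rows = unit_vectors(p_star, sensors)
    g11 = sum(a * a for a, _ in rows)
    g12 = sum(a * b for a, b in rows)
    g22 = sum(b * b for _, b in rows)
    b1 = sum(fi * a for fi, (a, _) in zip(f_hat, rows))
    b2 = sum(fi * b for fi, (_, b) in zip(f_hat, rows))
    det = g11 * g22 - g12 * g12
    return ((g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)

# Hypothetical scenario with six sensors surrounding the target.
P_STAR = (3.0, 4.0)   # stands in for the position returned by the minimization of J
V_TRUE = (1.0, -0.5)
SENSORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, -3.0), (-4.0, 5.0)]

F_CLEAN = [a * V_TRUE[0] + b * V_TRUE[1] for a, b in unit_vectors(P_STAR, SENSORS)]
PERTURBATION = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01]  # fixed stand-in for noise
F_NOISY = [f + e for f, e in zip(F_CLEAN, PERTURBATION)]

v_clean = velocity_least_squares(P_STAR, SENSORS, F_CLEAN)
v_noisy = velocity_least_squares(P_STAR, SENSORS, F_NOISY)
error_clean = math.hypot(v_clean[0] - V_TRUE[0], v_clean[1] - V_TRUE[1])
error_noisy = math.hypot(v_noisy[0] - V_TRUE[0], v_noisy[1] - V_TRUE[1])
```

Because the sign of $v$ enters the linear equations directly, this step also selects between $v^*$ and $-v^*$ without any additional logic.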

We conclude this section by briefly considering the maximum likelihood problem of

\[ [p_{ML}, v_{ML}] = \arg\min_{p, v} J_{ML}(p, v), \tag{27} \]

where $J_{ML}(p, v) = \sum_{i=1}^{n} \left( \hat{f}_i - \dfrac{v^\top (p - s_i)}{\|p - s_i\|} \right)^2$. Solving (27) involves solving a nonlinear least squares problem whose accuracy depends on the initial guess for its solution. We note that the output of Algorithm 1 can be used to initialize the nonlinear solver that minimizes (27). We demonstrate this later in the paper.

Algorithm 1 Target Position and Velocity Estimation Using $n$ doppler-shift Measurements Collected at Nodes $(1, \ldots, n)$ in a Sensor Network Via Polynomial Optimization.

Input: $\hat{f}_1, \ldots, \hat{f}_n$
Output: $p^*$, $v^*$

Require: At least 5 sensors at generic positions.

$J(p, v) \leftarrow \sum_{i=1}^{n} \left( \hat{f}_i^2 \|p - s_i\|^2 - (v^\top (p - s_i))^2 \right)^2$

$[p^*, v^*] \leftarrow \arg\min_{p, v} J(p, v)$

for $i = 1 : n$ do

    $\rho_i \leftarrow \dfrac{p^* - s_i}{\|p^* - s_i\|}$

end for

$A \leftarrow [\rho_1, \ldots, \rho_n]^\top$

$b \leftarrow [\hat{f}_1, \ldots, \hat{f}_n]^\top$

$v^* \leftarrow A^\dagger b$ {$A^\dagger$ is the pseudo-inverse of $A$.}

return $p^*$ and $v^*$

After  establishing  the  minimum  number  of  doppler  measurements  necessary  to  achieve  localization,  in  the  following  sections 
we  study  the  Cramer-Rao  inequality  for  calculating  the  target  position  and  velocity.  Later,  we  propose  an  optimal  sensor 
placement  where  the  objective  is  to  estimate  the  velocity  of  the  target. 

V.  The  Cramer-Rao  Inequality 

If $\mathcal{I}(x)$ is the Fisher information matrix, which is formally defined below, then the Cramer-Rao inequality lower bounds the variance achievable by an unbiased estimator. For an unbiased estimate $\hat{x}$ of $x$ we find

$E[(\hat{x} - x)(\hat{x} - x)^T] \geq \mathcal{I}(x)^{-1}$    (28)

If $\mathcal{I}(x)$ is singular then (in general) no unbiased estimator for $x$ exists with a finite variance [17]. If (28) holds with equality for some unbiased estimate $\hat{x}$, then the estimator is called efficient and the parameter estimate $\hat{x}$ is unique [17]. However, even if $\mathcal{I}(x)$ is non-singular, it is not guaranteed in practice that an unbiased estimator can be realized. Moreover, if an unbiased estimator can be realized, it is not guaranteed that an efficient estimator exists [18].

The condition (28) says nothing about the performance and realizability of biased estimators. That is, in order to use (28) we must consider only unbiased estimators [17]. The $(i,j)$th element of $\mathcal{I}(x)$ is given by [19]

$\mathcal{I}_{i,j}(x) = E\left[ \frac{\partial}{\partial x_i} \ln\left(f_{\hat{f}}(\hat{f}; x)\right) \frac{\partial}{\partial x_j} \ln\left(f_{\hat{f}}(\hat{f}; x)\right) \right]$    (29)

where $[p^T \; v^T]^T = [x_1 \ldots x_4]^T \in \mathbb{R}^4$ and $f_{\hat{f}}(\hat{f}; x)$ is the Gaussian likelihood function. We then easily find $\mathcal{I}(x) = \nabla F^T R_f^{-1} \nabla F$. The Fisher information metric characterizes the nature of the likelihood function. If the likelihood function is sharply peaked then the true parameter value is easier to estimate from the measurements than if the likelihood function is flatter.

A. General Results

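The Fisher information matrix is given by (30).

$\mathcal{I}(x) = \begin{bmatrix} \mathcal{I}(p) & \cdot \\ \cdot & \mathcal{I}(v) \end{bmatrix} = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \begin{bmatrix} \mathcal{I}_i(p) & \cdot \\ \cdot & \mathcal{I}_i(v) \end{bmatrix}$    (30)

We use $\mathcal{I}(x)$, $\mathcal{I}(p)$ and $\mathcal{I}(v)$ to denote the Fisher information matrix defined by considering only the parameters $x$, $p$ and $v$ respectively. Both $\mathcal{I}(p)$ and $\mathcal{I}(v)$ turn out to be principal sub-matrices of $\mathcal{I}(x)$. In all cases, independent measurements from additional sensors in general positions will never decrease the total information in each of $x$, $p$ and $v$.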

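Proposition 6. The condition $n \geq 4$ is a necessary condition for $\mathcal{I}(x)$ to be non-singular.

Proof: Recall that $\mathcal{I}(x) = \nabla F^T R_f^{-1} \nabla F$ or, given $R_f = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, we have [20]

$\mathcal{I}(x) = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \nabla f_i^T \nabla f_i$    (31)

which is a sum of $n$ matrices each with rank at most 1. Now a well-known result states that a rank-$k$ matrix can be written as the sum of $k$ rank-1 matrices but not fewer. Since $\mathcal{I}(x)$ is a $4 \times 4$ matrix, it can have full rank only if at least four such terms are present. This immediately implies our result and completes the proof. ■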
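The rank argument in the proof of Proposition 6 can be checked numerically. The following is a minimal sketch in plain Python, assuming the normalized measurement model $f_i = v^T(p - s_i)/\|p - s_i\|$ used in (27) and a planar geometry; the function names, sensor coordinates, target state and unit noise standard deviations are hypothetical and chosen only for illustration.

```python
import math

def doppler_gradient(p, v, s):
    """Gradient of f = v^T (p - s) / ||p - s|| with respect to x = [p; v]."""
    dx, dy = p[0] - s[0], p[1] - s[1]
    r = math.hypot(dx, dy)
    ux, uy = dx / r, dy / r          # unit vector from sensor to target
    f = v[0] * ux + v[1] * uy        # noise-free Doppler value under the assumed model
    # df/dp = (v - f u)^T / r and df/dv = u^T
    return [(v[0] - f * ux) / r, (v[1] - f * uy) / r, ux, uy]

def fisher_information(p, v, sensors, sigmas):
    """Sum over sensors of (1 / sigma_i^2) * grad_i^T grad_i, as in (31)."""
    info = [[0.0] * 4 for _ in range(4)]
    for s, sigma in zip(sensors, sigmas):
        g = doppler_gradient(p, v, s)
        for a in range(4):
            for b in range(4):
                info[a][b] += g[a] * g[b] / sigma ** 2
    return info

def matrix_rank(m, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda k: abs(m[k][c]))
        if abs(m[pivot][c]) < tol:
            continue                  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for k in range(rank + 1, rows):
            factor = m[k][c] / m[rank][c]
            for j in range(c, cols):
                m[k][j] -= factor * m[rank][j]
        rank += 1
    return rank

# Hypothetical scenario used only to exercise the functions.
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, -3.0), (-4.0, 6.0)]
p, v = (3.0, 4.0), (1.0, -2.0)

ranks = {}
for n in range(1, len(sensors) + 1):
    info = fisher_information(p, v, sensors[:n], [1.0] * n)
    ranks[n] = matrix_rank(info)
    print(n, ranks[n])
```

With fewer than four sensors the computed rank cannot exceed the number of sensors, so the $4 \times 4$ matrix is singular regardless of the geometry.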

Proposition  7.  The  following  statements  concerning  efficient  estimators  hold. 

1)  If  n  is  finite  then  no  efficient  estimator  exists  for  x. 

2)  If $p$ is known and $n$ is finite then an efficient estimator for $v$ exists and is given by the standard linear maximum likelihood estimator.

3)  If $v$ is known and $n$ is finite then no efficient estimator exists for $p$.

Proof: This result follows from a general result concerning efficient estimators given in Theorems 1 and 2 in [18]. ■
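The second statement of Proposition 7 can be made concrete with a short derivation. This is a sketch under the assumption that, for known $p$, the noisy measurement vector can be written as $\hat{f} = A v + e$ with $e \sim N(0, R_f)$, where the $i$th row of $A$ is $(p - s_i)^T / \|p - s_i\|$ and $A^T R_f^{-1} A$ is invertible; the symbols $\hat{f}$, $e$ and $\hat{v}$ are introduced here only for illustration.

```latex
\begin{align*}
\hat{v} &= \arg\min_{v} \; (\hat{f} - A v)^T R_f^{-1} (\hat{f} - A v)
         = (A^T R_f^{-1} A)^{-1} A^T R_f^{-1} \hat{f}, \\
E[\hat{v}] &= (A^T R_f^{-1} A)^{-1} A^T R_f^{-1} A \, v = v, \\
E\left[(\hat{v} - v)(\hat{v} - v)^T\right]
  &= (A^T R_f^{-1} A)^{-1} A^T R_f^{-1} R_f R_f^{-1} A \, (A^T R_f^{-1} A)^{-1} \\
  &= (A^T R_f^{-1} A)^{-1} = \mathcal{I}(v)^{-1}.
\end{align*}
```

Under these assumptions the model is linear in $v$, so the estimate is unbiased and attains (28) with equality for finite $n$. The measurements depend nonlinearly on $p$, so this argument does not carry over to statements 1 and 3.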

We  do  not  consider  the  design  of  unbiased  (but  inefficient)  estimators  for  either  p  or  x  in  this  work.  However,  we  know 
that  no  unbiased  estimator  for  x  exists  with  a  finite  variance  when  n  <  4.  Similarly,  no  unbiased  estimator  for  p  or  v  exists 
with  a  finite  variance  when  n  <  2. 

B.  Discussion  on  the  Cramer-Rao  Bound 

The Cramer-Rao inequality assumes an unbiased estimation algorithm, and an estimator which achieves the inequality is called an efficient estimator. An efficient estimator does not exist for $x$ or $p$ when $n$ is finite, but does exist for $v$ when $p$ is given. (For example, the well-known maximum likelihood localization techniques are only unbiased and efficient when the number of sensors approaches infinity.) Even if an efficient estimator does not exist, it may still be possible to design an unbiased estimator. This possibility is not explored in this work. In practice, a system designer may be constrained in their choice of parameter estimator, and the estimation technique used in practice will likely be biased [21], [22]. The Cramer-Rao bound for unbiased estimators is still an interesting benchmark with which intuitively pleasing results and performance measures can be derived. However, these results can only be considered as a guide.

It is interesting that the variance (or mean-square-error) of an estimate can sometimes be made smaller at the expense of increasing the bias [23]. The work of [24], [25] explores the concept of bias-variance trade-offs in estimation. In [17], [25] a biased Cramer-Rao inequality and in [24], [25] a uniform Cramer-Rao inequality are developed and can be used to study this so-called bias-variance trade-off. These ideas are yet to be fully explored in the localization and target tracking literature.


VI.  On  the  Fisher  Information  for  Velocity  Estimation 


Doppler-based measurements are often used to estimate the target velocity alone. In this section we explore the relationship between the transmitter and sensor positions and the velocity estimation error lower bound defined by $\mathcal{I}(v)$. To this end, consider the sub-matrix

$\mathcal{I}(v) = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \begin{bmatrix} \cos^2(\phi_i) & \cos(\phi_i)\sin(\phi_i) \\ \cos(\phi_i)\sin(\phi_i) & \sin^2(\phi_i) \end{bmatrix}$    (32)

where $\cos\phi_i = \psi_{i,1}$ and $\sin\phi_i = \psi_{i,2}$.


Firstly,  we  note  that  optimal  sensor  placement  for  velocity  estimation  is  equivalent  to  the  optimal  sensor  placement  for 
range-based  localization  as  outlined  in  [26].  However,  importantly,  the  optimal  sensor  placement  for  estimating  the  target 
velocity  does  not  depend  on  the  velocity  itself.  Hence,  such  placements  can  be  used  in  practice  to  estimate  the  velocity  of 
the  target  when  other  measures  are  available  for  positioning;  e.g.  we  have  already  discussed  a  number  of  hybrid  scenarios  in 
which  the  additional  measurements  provide  access  to  positioning  information  as  opposed  to  velocity. 


Moreover, with regard to the optimal sensor placement for velocity estimation, we assume the error variance in Doppler measurements is independent of their true value. While this is not unreasonable in many applications, this assumption can be relaxed such that the variance is different across sensors. We can even consider cases in which the standard deviation is multiplied by a percentage of the true range value between the target and individual sensors. In both these cases the Fisher information matrix for velocity estimation becomes very similar to the Fisher information matrix for bearing-only localization and we point to [26] for the details.

We  summarize  the  result  on  the  optimal  sensor  placement  for  estimating  the  velocity  of  the  target  in  the  following  proposition. 


Proposition 8. Let the angle subtended at the target by two sensors $i$ and $j$ be denoted by $\vartheta_{ij} = \vartheta_{ji}$. One set of optimal sensor placements is characterized by

$\vartheta_{ij} = \vartheta_{ji} = \frac{2}{n}\pi$    (33)

for all adjacent sensor pairs $i, j \in \{1, \ldots, n \geq 4\}$ with $|j - i| = 1$ or $|j - i| = n - 1$, and then by a possible application of the following actions on (33):

1)  Changing the true individual sensor-target ranges, i.e. moving a sensor from $s_i$ to $p + k(s_i - p)$ for some $k > 0$.

2)  Reflecting a sensor about the emitter position, i.e. moving a sensor from $s_i$ to $2p - s_i$.


For example, Fig. 2 illustrates two optimal sensor placement scenarios obtained from each other by reflecting a particular sensor about the emitter position.


Insert  Figure  2a  here 


Insert  Figure  2b  here 

Fig.  2.  This  figure  illustrates  two  scenarios  obtained  from  each  other  by  reflecting  a  particular  sensor  about  the  emitter  position.  This  reflection  does  not 
affect  the  optimality  of  the  sensor-target  configuration. 

For  more  information  on  the  proof  of  Proposition  8  and  further  details  the  reader  may  refer  to  [26]. 
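The invariances stated in Proposition 8 can be illustrated numerically. The sketch below is plain Python and rests on three assumptions that are not taken from the text: the determinant of $\mathcal{I}(v)$ is used as the optimality measure (see [26] for the criterion actually used), all sensors share the same noise variance, and the sensor bearings from the target are given directly as angles. The function names and the clustered comparison geometry are hypothetical.

```python
import math

def velocity_information(angles, sigma=1.0):
    """Sum of outer products of unit bearing vectors, scaled by 1 / sigma^2 (cf. (32))."""
    c2 = sum(math.cos(a) ** 2 for a in angles)
    cs = sum(math.cos(a) * math.sin(a) for a in angles)
    s2 = sum(math.sin(a) ** 2 for a in angles)
    k = 1.0 / sigma ** 2
    return [[k * c2, k * cs], [k * cs, k * s2]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

n = 5
# Adjacent sensors separated by 2*pi/n at the target, as in (33).
equal = [2.0 * math.pi * i / n for i in range(n)]
# Action 2: reflect one sensor about the target (its bearing changes by pi).
reflected = [a + math.pi if i == 2 else a for i, a in enumerate(equal)]
# A deliberately poor geometry: all sensors bunched in a narrow sector.
clustered = [0.1 * i for i in range(n)]

det_equal = det2(velocity_information(equal))
det_reflected = det2(velocity_information(reflected))
det_clustered = det2(velocity_information(clustered))
print(det_equal, det_reflected, det_clustered)
```

Action 1 changes only the sensor-target ranges, which do not appear in the angles at all, so it leaves the matrix unchanged by construction. Reflection leaves each outer product unchanged because both entries of the unit vector change sign.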

VII.  Illustrative  Example 

In this section we demonstrate the performance of the algorithm introduced in Section IV to estimate the position and the velocity of the target. We consider the setting depicted in Fig. 3. The measurements are corrupted by ten different levels of noise, where the noise is assumed to be Gaussian with zero mean and variable variance (and independent at each sensor). The carrier frequency is assumed to be constant for all sensors and equal to 150 MHz. Moreover, the nominal Doppler shifts are $\delta_1 = -9.2664$, $\delta_2 = 3.1640$, $\delta_3 = 3.1640$, $\delta_4 = -7.8162$, $\delta_5 = -9.4941$, $\delta_6 = 7.3782$, and $\delta_7 = 9.1797$, all in Hz. Fig. 4 depicts the error in the estimates of the position and the velocity for the different noise levels, after repeating the scenario one hundred times, when six and seven measurements are used. (In the six-measurement case, it is the measurement from $s_7$ which is omitted.) Furthermore, we employ a maximum likelihood estimator that is initialized with the solution of the optimization problem (25) to estimate the position and the velocity of the target. The obtained maximum likelihood estimates are also presented in Fig. 4. Note that in the simulations considered, when the estimator is initialized at a random value the estimate does not converge to a value close to the solution for any of the considered noise levels. One notices that, using seven measurements, the maximum likelihood solution initialized at the solution of optimization problem (25) outperforms the Cramer-Rao lower bound. We conjecture that this is due to the possibility of obtaining a biased estimate using the methods outlined in this paper.

Additionally, note that the setting presented in Fig. 3 shows elements of a degenerate geometry as well: the sensors at positions $s_2$ and $s_3$ and the target are collinear. However, due to the presence of other sensors at non-collinear positions, the estimation can still be carried out effectively. Because of this collinearity, if one considers only the five measurements taken by $s_1, \ldots, s_5$ the localization problem cannot be solved.
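To give a sense of scale for the nominal Doppler shifts listed above, they can be converted to radial speeds using the stated 150 MHz carrier. The sketch below is plain Python and assumes the standard non-relativistic one-way relation, in which the shift equals the carrier frequency times the radial speed divided by the speed of light. The sign convention, the constant names and the function name are assumptions made for illustration and are not taken from the text.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s
CARRIER_HZ = 150e6               # carrier frequency stated in the example

def radial_speed(doppler_shift_hz, carrier_hz=CARRIER_HZ):
    """Radial speed (m/s) implied by a one-way, non-relativistic Doppler shift."""
    return SPEED_OF_LIGHT * doppler_shift_hz / carrier_hz

# Nominal shifts for sensors 1 to 7, in Hz, as listed in the example.
nominal_shifts_hz = [-9.2664, 3.1640, 3.1640, -7.8162, -9.4941, 7.3782, 9.1797]
radial_speeds = [radial_speed(d) for d in nominal_shifts_hz]

for i, (d, s) in enumerate(zip(nominal_shifts_hz, radial_speeds), start=1):
    print(f"sensor {i}: shift {d:+.4f} Hz -> radial speed {s:+.2f} m/s")
```

The identical values for sensors 2 and 3 are what one would expect if the two sensors lie on the same ray from the target, which is consistent with the collinearity noted for Fig. 3.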

Insert  Figure  3  here 

Fig.  3.  The  setting  considered  in  the  illustrative  example.  The  squares  denote  the  position  of  the  sensors  and  the  diamond  is  the  target. 


Insert Figure 4a here


Insert Figure 4b here


Fig. 4. The error in the estimate of the position and the velocity of the target using the proposed polynomial optimization method (Polynomial) and maximum likelihood initialized at the solution of the optimization problem (ML-Polynomial) after repeating the scenario 100 times with variable noise variances when six and seven sensors are used.


VIII. Concluding Remarks and Future Directions

The minimum number of Doppler shift measurements necessary to have a finite number of solutions for the unknown target position and velocity was calculated analytically via algebraic arguments. Additionally, we stated the necessary and sufficient number of generic measurements to have a unique solution for the target parameters. The same problem was then studied for the case where, in addition to Doppler shift measurements, other types of measurements are available, e.g. bearing or distance to the target from each of the sensors. Finally, a method based on polynomial optimization was introduced to calculate an estimate for the position and the velocity of the target using noisy Doppler shift measurements. A numerical example was presented to demonstrate the performance of this algorithm.

Moreover, some remarks concerning the Cramer-Rao inequality and its relationship to the estimation problem were given. For the case of Doppler-based target velocity estimation we completely characterized the sensor-target geometry and provided a number of conditions on the optimal placement of the sensors and the transmitters.

A possible future research direction is to design a dynamical estimator that estimates the position and the velocity of the target from continuously acquired Doppler shift measurements. Alternatively, one could consider the notion of constraint-based optimization for localization as discussed in, e.g., [19], [27]-[31].